An individual responsible for developing, deploying, managing, and monitoring data pipelines that ingest and transform data from one or more sources to one or more targets.
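As a rough illustration of the definition above, the following minimal sketch moves records from two sources through one transform into two targets. Everything in it is a hypothetical stand-in chosen for the example: the names `run_pipeline` and `normalize`, the in-memory lists that play the role of source systems, and the `warehouse` and `audit_log` targets. None of it comes from a real library or from the reports quoted on this page.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional

Record = Dict[str, Any]


def run_pipeline(
    sources: Iterable[Iterable[Record]],
    transform: Callable[[Record], Optional[Record]],
    targets: List[Callable[[Record], None]],
) -> int:
    """Ingest records from every source, transform each one, and deliver
    the result to every target. Returns the number of records delivered."""
    delivered = 0
    for source in sources:
        for record in source:
            out = transform(record)
            if out is None:  # the transform may reject a record
                continue
            for deliver in targets:
                deliver(out)
            delivered += 1
    return delivered


def normalize(record: Record) -> Optional[Record]:
    """Hypothetical transform: reject records without an id, then
    coerce the id to int and tidy the name."""
    if record.get("id") is None:
        return None
    return {
        "id": int(record["id"]),
        "name": str(record.get("name", "")).strip().title(),
    }


# Hypothetical in-memory sources standing in for real systems.
crm_rows = [{"id": "1", "name": " ada lovelace "}, {"id": None, "name": "no id"}]
web_rows = [{"id": 2, "name": "GRACE HOPPER"}]

# Hypothetical targets: a warehouse table and an audit log of ids.
warehouse: List[Record] = []
audit_log: List[int] = []

count = run_pipeline(
    [crm_rows, web_rows],
    normalize,
    [warehouse.append, lambda r: audit_log.append(r["id"])],
)
```

In practice the sources and targets would be databases, files, or streams, and the monitoring part of the role would add logging, retries, and alerts around a loop like this one.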
Added Perspectives
Data engineers do the heavy lifting required to create and manage the information supply chain. We used to call these individuals ETL developers and data architects. They identify source data, map data flows, model databases, define and monitor data transformation jobs, and work with database administrators to create, manage, and tune databases and optimize performance. Some also design business views for business users, especially if they are built within a database.
- Wayne Eckerson in A Reference Architecture for Self-Service Analytics
September 1, 2016 (Report)
Data engineering is a critically important part of analytics that receives little attention compared to data science. Recent research shows 12 times as many unfilled data engineer jobs as data scientist positions. Breadth and depth of required skills limit the number of people qualified to work as data engineers. Clearly the demand for data engineers outstrips the supply, and the gap continues to grow. The large number of unfilled jobs reflects the complexity of data engineering. Breadth of knowledge ranges from relational databases to NoSQL, from batch ETL to data stream processing, and from traditional data warehousing to data lakes. Depth of skills includes hands-on work with Hadoop; programming in Java, Python, R, Scala, or other languages; and data modeling from relational and star-schema to document stores and graph databases. The data engineer is part database engineer (building the databases that implement data warehouses, data lakes, and analytic sandboxes) and part software engineer (building the processes, pipelines, and services that move data through the ecosystem and make it accessible to data consumers). One goal of data fabric is to automate much of data engineering to increase reuse and repeatability, and to expand data engineering capacity.
- Dave Wells in Data Fabric Smart Data Engineering, Operations, and Orchestration
September 30, 2019 (Report)
Data engineering performs the lifecycle of work involved in assembling data for analytics. This means architecting, designing, building, operating, and adapting the pipelines that ingest, process, and deliver data from source to consumer. While the discipline was born in the world of SQL, ETL, and data warehousing, data engineering has evolved in recent years to address data science. Many data engineers now execute projects that support scripting with Python and R, often to apply advanced algorithms to semi-structured or unstructured data.
- Dave Wells in Architecting and Automating Data Pipelines: A Guide to Efficient Data Engineering for BI and Data Science
October 16, 2020 (Report)
Relevant Content
Apr 01, 2018 - Data engineering is one of the hottest and most difficult jobs to fill in the field of analytics. Breadth and depth of required skills limits the number...
Dec 02, 2016 - Big data management software binds a big data environment together, fostering continuous alignment of data with dynamic business needs.
Jan 06, 2019 - It’s not technology that determines success in data analytics; it’s people, processes, and programs, which is the focus of this special report.
© Datalere, LLC. All rights reserved