
Designing a Data Literacy Approach for Data Engineers - Interview w/ Dan Sullivan
Description
Provided as a free resource by DataStax https://www.datastax.com/products/datastax-astra?utm_source=DataMeshRadio (AstraDB)

https://www.patreon.com/datameshradio (Data Mesh Radio Patreon) - get access to interviews well before they are released

In this episode, Scott interviewed Dan Sullivan, Principal Data Architect at 4 Mile Analytics.

A key point Dan brought up is tech debt around data. Taking on tech debt should ALWAYS be a very conscious choice. But the way most organizations work with data, it is much more of an unconscious choice, especially by data producers, who take on debt that the data engineering teams will have to pay down. We need to find ways to deliver value quickly but with discipline.

Zhamak has mentioned in a few talks that data engineers may soon not exist in orgs deploying data mesh. Dan somewhat agrees that data engineering will change a lot, because right now there is a big rush to build out the initial iterations of data products (the industry definition). Going forward, Dan thinks there will be a need for data engineers who can really understand consumer needs and build the interactions, e.g. the SDKs, to leverage data.

Dan's 3 key pillars for driving data literacy for data engineers are domain knowledge, learning, and collaboration. Data engineers should pair with business people to acquire domain knowledge, they should be given the opportunity to spend time learning through things like online training, and they should collaborate across the organization instead of just being ticket tacklers.

Per Dan, data engineers differ depending on background: some come from a data analyst/data science background, but many come from a software engineering background. So we can't treat training all data engineers as if it's the same. But we do need them to have a well-rounded background. A big need is for them to understand more about the data consumers and/or the producers, so embedding them in the domains can really help.
For driving buy-in with data engineers, Dan p