Data Integration Engineer vacancy at Syndigo
Beginning in 2017, under a new leadership team working to deliver on an updated vision, we expanded our offerings to serve the broader brand, distributor, and retailer ecosystem, including industrial and retail foodservice, grocery, consumer goods, hardlines and automotive. This journey included building and acquiring additional detailed and verified product information solutions to help consumers and operators in their buying decisions; store optimization services for effective in-store layouts and shelf merchandising; syndication of data to GS1 global standards via GDSN; publishing of enhanced product content integrated into retail sites globally; and interactive tools to allow restaurant and foodservice brands to organize and share nutrition data with their customers.
Our clients all benefit from Syndigo’s integrated platform, Content Experience Hub, which enables them to collect, store, manage, audit, syndicate and publish their content through our solutions across the largest trading network in the world.
General overview of the role
The Data Integration Engineer is responsible for implementing data ingestion, validation, and transformation pipelines at Syndigo. In collaboration with the Product, Development, and Enterprise Data teams, the Data Integration Engineer will design and maintain batch and streaming integrations across a variety of data domains and platforms. The ideal candidate is experienced in big data and cloud architecture and is excited to advance innovative analytics solutions!
The preference is that this role be based in one of Syndigo's offices in Brookfield WI, Nashville TN, Chicago IL, or Lisle IL, but well-qualified candidates in other parts of the United States can be considered as fully remote employees.
Responsibilities
Work with stakeholders to define and develop data ingest, validation, and transform pipelines
Participate in solution and architecture design & planning
Troubleshoot data pipelines and resolve issues in alignment with SDLC
Diagnose and troubleshoot data issues, recognizing common data integration and transformation patterns
Estimate, track, and communicate status of assigned items to a diverse group of stakeholders
Requirements
3+ years of experience in developing large-scale data pipelines in a cloud environment
Demonstrated proficiency in Scala (object-oriented programming) or Python (Scala preferred), and Spark SQL
Experience with Databricks, including Delta Lake
Experience with Azure and cloud environments, including Azure Data Lake Storage (Gen2), Azure Blob Storage, Azure Tables, Azure SQL Database, Azure Data Factory
Experience with ETL/ELT patterns, preferably using Azure Data Factory and Databricks jobs
Fundamental knowledge of distributed data processing and storage
Fundamental knowledge of working with structured, unstructured, and semi-structured data
Excellent analytical and problem-solving skills
Ability to effectively manage time and adjust to changing priorities
Bachelor’s degree preferred, but not required