Data Engineer
Date: Oct 31, 2024
Location: Shanghai, SH, CN, 200031
Company: Corning
Requisition Number: 65496
Corning is vital to progress – in the industries we help shape and in the world we share.
We invent life-changing technologies using materials science. Our scientific and manufacturing expertise, boundless curiosity, and commitment to purposeful invention place us at the center of the way the world interacts, works, learns, and lives.
Our sustained investment in research, development, and invention means we’re always ready to solve the toughest challenges alongside our customers.
The global Information Technology (IT) Function leads efforts to align IT and business strategy, leverage IT investments, and optimize end-to-end business processes and the associated information integration technologies. Through these efforts, IT helps improve the competitive position of Corning's businesses with IT-enabled processes. IT also delivers applications, infrastructure, and project services cost-efficiently to Corning worldwide.
Responsibilities
As a Data Engineer for our advanced analytics platforms, your main responsibilities will be to:
- Design and implement patterns of practice for productized, portable, modular, instrumented, CI/CD-automated, and highly performant data ingestion pipelines that leverage structured streaming techniques, processing both batch and streamed data in unstructured, semi-structured, and structured form, using Apache Spark, Delta Lake, Delta Engine, Hive, and other relevant tech stacks
- Ensure that data ingestion pipelines built with these patterns validate and profile inbound data reliably, identify anomalies, and trigger appropriate remediation actions by operations staff when needed
- Filter, consolidate, and contextualize ingested data, then aggregate it according to data analytics and ML requirements.
- Use agile development practices, and continually improve development methods with the goal of automating the build, integration, deployment and monitoring of ingestion, enrichment and ML pipelines
- Using your expertise and influence, help establish patterns of practice for the above, and encourage their adoption by software and data engineering teams across the company
- Monitor and optimize the performance of data pipelines to ensure efficient data processing and storage.
Education, Experience, and Licensing:
- Advanced degree in computer science, but at a minimum a bachelor's degree in
- 5 years of programming proficiency in at least one modern programming language (e.g. Java, C#, JavaScript) and at least one other high-level language such as Python or SQL
- Expert-level proficiency with agile software development and continuous integration and continuous deployment (CI/CD) methodologies, along with supporting tools such as Git (GitLab) and Terraform
- 5+ years of experience in big data engineering roles, developing and maintaining ETL and ELT pipelines for data warehousing and for on-premises and cloud data lake environments
- 3+ years of experience with AWS platform services, including S3, EC2, Database Migration Service (DMS), RDS, EMR, Redshift, Lambda, DynamoDB, CloudWatch, and CloudTrail
- Strong technical collaboration and communication skills
Technical Qualifications:
- Proficiency with functional programming methods and their appropriate use in distributed systems
- Expert proficiency with data management fundamentals and data storage principles
- Expert proficiency with AWS foundational compute services, including S3, EC2, ECS, EKS, IAM, and CloudWatch
- Experience with Databricks, including the use of Databricks notebooks, clusters, and jobs
Other Qualifications:
- Demonstrated curiosity and an ability to learn new skills on an ongoing, sustained basis
- Demonstrated systems perspective when analyzing problems, thinking about overall operation, failure modes and how to address these problems proactively
- A strong sense for the importance of documentation, and the importance of not having to learn things twice
- Ability to work in an agile product team environment and balance a diverse set of stakeholder requests
- Excellent oral and written communication skills with an ability to break down complex technical systems to help business partners understand the value
- Strong technical collaboration and communication skills as well as the ability to drive cultural change and adoption of best practices through community participation
- Ability to collaborate with other teams across the company, defining technology roadmaps, sharing experiences and lessons learned for continual improvement
- Excellent problem-solving and troubleshooting skills
- Process-oriented with great documentation skills
- Experience with data visualization tools and techniques
- Familiarity with machine learning frameworks and libraries
We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. To request an accommodation, please contact us at accommodations@corning.com.