Senior Data Engineer, Advanced Analytics Platforms (Remote)

Date: Sep 20, 2022

Location: Charlotte, NC, US, 28216; North Carolina, NC, US; New Jersey, NJ, US; New York, NY, US; Massachusetts, MA, US; Corning, NY, US, 14831

Company: Corning

Requisition Number: 52646


Corning is one of the world’s leading innovators in materials science. For more than 160 years, Corning has applied its unparalleled expertise in specialty glass, ceramics, and optical physics to develop products that have created new industries and transformed people’s lives.

Corning succeeds through sustained investment in R&D, a unique combination of material and process innovation, and close collaboration with customers to solve tough technology challenges.

The global Information Technology (IT) Function is leading efforts to align IT and business strategy, leverage IT investments, and optimize end-to-end business processes and associated information integration technologies. Through these efforts, IT helps to improve the competitive position of Corning's businesses through IT-enabled processes. IT also delivers applications, infrastructure, and project services in a cost-efficient manner to Corning worldwide.


Location: Corning, NY; Charlotte, NC or Remote


Data, automation, and advanced analytics technologies are drastically transforming industrial manufacturers, moving them beyond point process automation toward systemic, highly contextualized, data-driven systems. Corning is building the foundational digital infrastructure for these company-wide efforts and is looking for passionate, hard-working, and talented staff-level software engineers who will develop and enhance the data ingestion pipelines that are crucial to these efforts.



The Data Engineer, Advanced Analytics Platforms, will work with our core platform development team as well as domain experts, application developers, controls engineers, and data scientists. Their primary responsibility will be to develop reliable, instrumented data ingestion pipelines that land inbound data from multiple process and operational data stores throughout the company in on-premises and cloud-based data lakes. These pipelines will require data validation and data profiling automation, along with version control and CI/CD, to ensure the ongoing resiliency and maintainability of the inbound data flows supporting our advanced analytics projects.



As a data engineer for our advanced analytics platforms, your main responsibilities will be:

  • Design, test, deploy and maintain production big-data ingestion pipelines using established frameworks, patterns of practice, agile software development and CI/CD practices, working closely with the Principal Software Engineer – Data Ingestion
  • Work with cross-organizational data source teams to define data ingestion requirements for structured, unstructured and semi-structured data, pilot their implementation, and ensure that the data source teams accept the resulting landed data as valid
  • Define and implement automated validation and profiling capabilities needed to ensure reliable data delivery, using agile software development and CI/CD practices
  • Work with data source teams, domain experts and data scientists to define data cleansing and data enrichment requirements for landed data
  • Implement data cleansing and enrichment code using established patterns of practice
  • Work with data source teams, domain experts and data scientists to validate landed, cleansed and enriched data, using agile software development and CI/CD practices, while ensuring that the final datasets are directly usable by them without additional processing effort
  • Actively participate in code reviews and technical information sharing with your team members and the broader software engineering community at Corning
  • Stay up to date with industry standards and technological advancements that will improve the quality, productivity and performance of your work
  • Provide support in a DevOps environment to monitor tokens, jobs and overall system performance



Required Qualifications

  • Bachelor's degree in computer science, engineering, mathematics, or a related technical discipline
  • 5+ years of experience in big data engineering roles, developing and maintaining ETL and ELT pipelines for data warehousing and for on-premises and cloud data lake environments
  • 5+ years of demonstrated production programming proficiency in at least one modern JVM language such as Java, Scala or Kotlin, as well as an interpreted language such as Python
  • 3+ years of experience developing batch, micro-batch and streaming ingestion pipelines using high-level Apache Spark APIs (PySpark, SparkR, Spark SQL and Scala)
  • 3+ years of production experience using SQL and DDL
  • 2+ years of DevOps experience with AWS platform services, including S3, EC2, Database Migration Service (DMS), RDS, EMR, Redshift, Lambda, DynamoDB, CloudWatch and CloudTrail
  • Strong, hands-on technical familiarity with Apache Spark architecture, S3, Parquet and Delta Lake architecture, technologies and tools
  • Expert level proficiency with both traditional relational and polyglot persistence technologies
  • Expert-level proficiency with agile software development and continuous integration and continuous deployment (CI/CD) methodologies, along with supporting tools such as Git (GitLab), Jira, Terraform and New Relic
  • Strong, hands-on familiarity with notebook environments including JupyterHub
  • Proven success in communicating with users, other technical teams, and senior management to collect requirements, describe data modeling decisions and data engineering strategy


Preferred Qualifications

  • Prior full-stack app development experience (front-end, back-end, microservices)
  • Familiarity with the following tools and technology practices:
    • Oracle, Microsoft SQL Server, SSIS, SSRS
    • Established enterprise ETL and integration tools including Informatica, MuleSoft
    • Established open-source data integration and DAG tools including NiFi, StreamSets, Airflow
    • Data sources and integration solutions commonly used in manufacturing enterprises, including PI Integrator, Maximo
    • Reporting and analysis tools including Power BI, Tableau, SAS JMP


We prohibit discrimination on the basis of race, color, gender, age, religion, national origin, sexual orientation, gender identity or expression, disability, veteran status or any other legally protected status.


We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.
