Principal Software Engineer - Advanced Analytics - Emerging Technology

Date: Jun 14, 2022

Location: Charlotte, NC, US, 28216

Company: Corning

Requisition Number: 51010


Corning is one of the world’s leading innovators in materials science. For more than 160 years, Corning has applied its unparalleled expertise in specialty glass, ceramics, and optical physics to develop products that have created new industries and transformed people’s lives.

Corning succeeds through sustained investment in R&D, a unique combination of material and process innovation, and close collaboration with customers to solve tough technology challenges.

The global Information Technology (IT) Function is leading efforts to align IT and business strategy, leverage IT investments, and optimize end-to-end business processes and associated information integration technologies. Through these efforts, IT helps improve the competitive position of Corning's businesses through IT-enabled processes. IT also delivers Information Technology applications, infrastructure, and project services in a cost-efficient manner to Corning worldwide.


Data, automation, and advanced analytics technologies are drastically transforming industrial manufacturers, moving them beyond point process automation to systemic, highly contextualized, data-driven systems. Corning is building the foundational digital infrastructure for these company-wide efforts and is looking for passionate, hard-working, and talented staff-level software engineering architects to design that foundation for reuse, velocity, and scale.



The Principal Software Engineer, Advanced Analytics will be part of a core platform development team working with domain experts, application developers, controls engineers, data engineers, and data scientists. Your primary responsibility will be to design large-scale, distributed, and modular contextualization, transformation, and normalization systems that serve prepared datasets for model training and inferencing. You will also develop outbound data engineering, integration, and machine learning deployment pipelines, and manage the lifecycle of these data engineering and machine learning engineering pipelines in coordination with your architecture peers and communities of practice throughout the company.


These systems will span both cloud and on-premise environments and will require close collaboration with many technical teams to ensure success.



As a Principal Software Engineer, Advanced Analytics, your main responsibilities will be:

  • Designing and implementing portable, modular, instrumented, and highly performant data contextualization pipelines from landed and cleansed, batch and streamed unstructured data, using Apache Spark, Delta Lake, and/or Databricks
  • Designing and implementing portable, modular, instrumented and highly performant model deployment pipelines for many types of machine learning including supervised and unsupervised learning as well as CNNs, RNNs or other deep learning algorithms
  • Designing and implementing outbound data engineering pipelines that serve curated datasets to business intelligence, reporting and HMI systems
  • Working closely with domain expert data scientists and process and controls engineers, both within and outside the company, to understand and automate transformation, normalization, and other contextualization operations based on the types of analytics performed on the inbound datasets; understanding model performance management requirements and designing suitable inferencing instrumentation systems and practices that meet them
  • Delivering and presenting proof-of-concept implementations that explain the key technologies you have selected for your design and the recommended patterns of practice for ongoing development and lifecycle management. The target audience for these efforts spans the company and includes project stakeholders, data scientists, process experts, other domain architects, and relevant technical communities of practice interested in leveraging your code for their own projects
  • Working with your fellow developers using agile development practices, and continually improving development methods with the goal of automating the build, integration, deployment and monitoring of production inferencing and dataset delivery systems
  • Working with the relevant communities of practice on component roadmaps, and serving as a trusted committer for your code for inner sourcing efforts with other development teams in the company
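The contextualization and normalization work described above can be illustrated with a minimal sketch. This is not Corning's actual pipeline; the sensor IDs, asset names, and helper functions are hypothetical, and the logic is shown in plain Python rather than Spark to keep the example self-contained. The same two steps (joining raw readings with asset context, then normalizing per sensor) would map naturally onto Spark DataFrame joins and window aggregations:

```python
from statistics import mean, pstdev

# Hypothetical raw sensor readings landed from a plant historian
raw = [
    {"sensor_id": "T-101", "value": 410.0},
    {"sensor_id": "T-101", "value": 430.0},
    {"sensor_id": "P-7",   "value": 2.0},
    {"sensor_id": "P-7",   "value": 4.0},
]

# Hypothetical context table mapping sensors to assets and units
context = {
    "T-101": {"asset": "furnace-3", "unit": "degC"},
    "P-7":   {"asset": "press-1",   "unit": "bar"},
}

def contextualize(records, ctx):
    """Contextualization: join each raw reading with its asset metadata."""
    return [{**r, **ctx[r["sensor_id"]]} for r in records if r["sensor_id"] in ctx]

def normalize(records):
    """Normalization: z-score each value against its own sensor's statistics."""
    by_sensor = {}
    for r in records:
        by_sensor.setdefault(r["sensor_id"], []).append(r["value"])
    stats = {s: (mean(v), pstdev(v)) for s, v in by_sensor.items()}
    out = []
    for r in records:
        mu, sigma = stats[r["sensor_id"]]
        out.append({**r, "z": (r["value"] - mu) / sigma if sigma else 0.0})
    return out

# A prepared dataset ready for model training or inferencing
prepared = normalize(contextualize(raw, context))
```

In a production Spark implementation, `contextualize` would typically be a broadcast join against a curated asset registry, and `normalize` a grouped aggregation, so both steps scale to streamed and batch inputs alike.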


Education & Experience

  • An advanced degree in computer science or data science is strongly preferred; an equivalent engineering, data science, or mathematics degree, or a technical undergraduate degree combined with relevant experience, will also be considered
  • 3+ years of experience working with data scientists in a large-scale data engineering or production machine learning inferencing capacity, working with various types of supervised and unsupervised learning algorithms for classification, recommendation, anomaly detection, clustering and segmentation, as well as CNNs, RNNs or other deep learning algorithms
  • 5+ years of full-stack experience developing large scale distributed systems and multi-tier applications
  • 10+ years of programming proficiency in at least one modern JVM language (e.g., Java, Kotlin, Scala) and at least one other high-level programming language such as Python
  • 2+ years of production DevOps experience
  • 3+ years of programming on the Apache Spark platform, leveraging both the low-level RDD and MLlib APIs and the higher-level APIs (SparkSession, DataFrames, Datasets, GraphFrames, Spark SQL, Spark ML). Demonstrated deep understanding of Spark core architecture, including physical plans, DAGs, UDFs, job management, and resource management
  • At least 1 year of implementation experience with Apache Airflow, and a demonstrated expert level understanding of both segmented and unsegmented Directed Acyclic Graphs and their operationalization
  • Experience working with MLflow and a demonstrated ability to lead architecture efforts for its implementation
  • Demonstrated experience working with inner sourcing initiatives, serving both as a trusted committer and contributor
  • Strong technical collaboration and communication skills
  • Unwavering commitment to coding best practice and a strong proponent of code review
  • Cultural bias toward continual learning, sharing best practices, and encouraging and elevating less-experienced colleagues as they learn
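The Airflow qualification above centers on expressing pipelines as directed acyclic graphs and operationalizing them. A minimal sketch of the underlying idea, using only the Python standard library (the task names and dependency structure are hypothetical and illustrative, not Airflow's API): a task may run only after all of its upstream dependencies have completed, which is exactly what a topological ordering guarantees:

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline DAG: each key depends on the tasks in its value set.
# An orchestrator such as Airflow schedules such a graph so that every task
# starts only after all of its upstream dependencies have finished.
dag = {
    "extract":       set(),
    "contextualize": {"extract"},
    "normalize":     {"contextualize"},
    "train":         {"normalize"},
    "serve":         {"train", "normalize"},
}

def run(dag, task_fns):
    """Execute tasks in a dependency-respecting (topological) order."""
    order = list(TopologicalSorter(dag).static_order())
    results = {}
    for name in order:
        results[name] = task_fns[name](results)
    return order, results

# Trivial task bodies that just record that they ran
fns = {name: (lambda n: lambda results: f"{n} done")(name) for name in dag}
order, results = run(dag, fns)
```

A real orchestrator adds what this sketch omits: scheduling, retries, backfills, and parallel execution of independent branches, which is the "operationalization" the qualification refers to.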


Additional Technical Qualifications

  • Proficiency with functional programming methods and their appropriate use in distributed systems
  • Expert proficiency with AWS foundational compute services, including S3 and EC2, ECS and EKS, IAM and CloudWatch
  • Proficiency working with Ceph, Kubernetes and Docker
  • Expert proficiency with continuous integration and continuous deployment methodologies
  • Expert proficiency with data management fundamentals and data storage principles


Other Qualifications

  • Strong relationship-building skills
  • Proven success working in a highly matrixed environment
  • Excellent analytical and decision-making abilities
  • Demonstrated willingness to go the extra mile, take on what needs to be done, and maintain a positive attitude that adapts to change
  • Strong leadership and excellent verbal and written communication skills, with the ability to develop and sell ideas


This position does not support immigration sponsorship.


We prohibit discrimination on the basis of race, color, gender, age, religion, national origin, sexual orientation, gender identity or expression, disability, veteran status or any other legally protected status.


We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.

Nearest Major Market: Charlotte