Sr. Lead Data Engineer


This job is no longer active

The Company

Hitachi Vantara combines technology, intellectual property and industry knowledge to deliver data-managing solutions that help enterprises improve their customers’ experiences, develop new revenue streams, and lower the costs of business. Hitachi Vantara elevates your innovation advantage by combining IT, operational technology (OT) and domain expertise. Come join our team and our employee-focused culture and help drive our customers’ data to meaningful customer outcomes.

The Role

As a Sr. Big Data Engineer, you will provide engineering knowledge to create and enhance data solutions that enable seamless delivery of data across our enterprise. You will be on the cutting edge of finding and integrating new technology and tools for data-centric projects. Additionally, you will provide technical consulting to peer data engineers during the design and development of highly complex and critical data projects. Some of these projects will include designing and developing data ingestion and processing/transformation frameworks leveraging open source and AWS tools such as NiFi, EMR, Java, Scala, Spark APIs, and AWS Glue. You will also work on real-time processing solutions using tools such as Spark Streaming, MQ, Kafka, and AWS Kinesis, and you will deploy application code using CI/CD tools and techniques.

Responsibilities:

• Develop data-driven solutions utilizing current and next-generation technologies to meet evolving business needs.
• Quickly identify opportunities and recommend possible technical solutions.
• Utilize multiple development languages and tools, such as Python, Spark, and Hive, to build prototypes and evaluate results for effectiveness and feasibility.
• Operationalize data ingestion and data-analytic tools for enterprise use.
• Utilize the tools available to you across AWS services.
• Develop real-time data ingestion and stream-analytic solutions leveraging technologies such as Kafka, Apache Spark, NiFi, Python, HBase, and Hadoop.
• Develop custom data pipelines (cloud and locally hosted).
• Work heavily within the AWS and Hadoop ecosystems.
• Support deployed data applications and analytical models by acting as a trusted advisor to Data Scientists and other data consumers, identifying data problems, and guiding issue resolution with partner Data Engineers and source data providers.
• Provide subject matter expertise in analysis and in preparing specifications and plans for the development of data processes.
• Ensure proper data governance policies are followed by implementing or validating data lineage, quality checks, classification, etc.

Qualifications

• Bachelor’s degree in computer science or a related field is required (master’s preferred).
• You have a minimum of 7 years of experience in the design, development, and deployment of large-scale, distributed, and cloud-deployed software services.
• You are expected to be an expert in SQL and RDBMS and skilled at modeling data for relational, analytical, and big data workloads.
• You have a minimum of 4 years of experience in Big Data software development technologies (e.g., Hadoop, Hive, Spark, Kafka) and exposure to resource/cluster management technologies.
• Minimum of 3 years of experience with AWS (e.g., EC2, S3, EMR, SNS, SQS, Aurora, Redshift).
• Experience with a range of software technologies and solutions, and an understanding of where to use each.
• 2 years of experience with data virtualization tools such as Denodo.

We are an equal opportunity employer. All applicants will be considered for employment without attention to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran or disability status.
