Big Data Engineer
The data architecture and engineering team requires a senior engineer to help establish an Enterprise Data Lake and support new Hadoop-based initiatives. This individual will contribute to the success of both our product and data teams by being flexible and responsive to project requests, delivering quality solutions and services that support new operational and analytics demands while integrating with and preserving critical legacy components of the data architecture.
As a developer on this team you will collaborate with cross-functional teams across the organization. You will be responsible for code and for data management activities that meet project and organizational requirements, collaborating with application developers, data and solution architects, infrastructure engineers, project managers, business analysts, QA, technical directors, and account managers. The successful candidate will be adept at working in a dynamic environment, know how to deal with ambiguity, and be determined to learn, troubleshoot code, and complete tasks that span multiple large database systems.
Key Responsibilities
- Develop an understanding of the Tally platform and its operational/analytics needs within 30 days.
- Understand the software stack as a whole: the systems and data architecture, ETL processes and frameworks, and maintenance practices.
- Understand application interactions with the database.
- Understand the project’s scope in order to identify when requirements are out of scope and require a change order.
- Help drive and maintain a high standard of quality for data architecture and database code changes within the Hadoop ecosystem, Elasticsearch, and Microsoft SQL Server.
- Perform all phases of data engineering including requirements analysis, application design, and code development and testing.
- Use previous experience to evolve the data platform to include Hadoop services and an Enterprise Data Lake.
- Respect, implement, and manage strong data governance and security practices, especially with respect to the Data Lake.
- Estimate engineering work effort and effectively identify and prioritize the high impact tasks.
- Troubleshoot production support issues and identify solutions as required to back up the team on operational activities.
- Ensure code is efficient and optimized for best performance.
- Ensure that objects are modeled appropriately.
- Review and test code changes in lower environments.
- Understand, manage, and troubleshoot jobs and monitoring software, contributing scripts that improve visibility into the health of the system, databases, and Hadoop cluster (see the monitoring sketch after this list).
- Effectively work with the team and team workflow toolset to manage communication, status, issues and code quality.
- Have basic experience with Atlassian tools such as Sourcetree/Bitbucket, JIRA, and Confluence.
- Create and update JIRA tickets with enough information for the development team to estimate and resolve issues in a timely manner.
- Competently use version control (Git) to manage topic branches and Pull Requests.
- Review code and provide feedback relative to best practices and improving performance.
- Follow build and automation practices to support continuous integration and improvement.
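As a flavor of the monitoring scripts mentioned above, here is a minimal sketch in Python that polls the NameNode's JMX endpoint and flags degraded HDFS health. The hostname, port, and alert thresholds are placeholders rather than anything specified in this posting; HDP 2.x NameNodes typically serve JMX on port 50070.

# Minimal HDFS health-check sketch. The NameNode host, port, and alert
# thresholds below are hypothetical placeholders; adjust for your cluster.
import sys
import requests

NAMENODE_JMX = "http://namenode.example.com:50070/jmx"  # hypothetical host


def fetch_fsnamesystem_metrics():
    """Query the NameNode's JMX servlet for FSNamesystem metrics."""
    resp = requests.get(
        NAMENODE_JMX,
        params={"qry": "Hadoop:service=NameNode,name=FSNamesystem"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["beans"][0]


def main():
    metrics = fetch_fsnamesystem_metrics()
    used_pct = 100.0 * metrics["CapacityUsed"] / metrics["CapacityTotal"]
    under_replicated = metrics["UnderReplicatedBlocks"]
    print(f"HDFS capacity used: {used_pct:.1f}%")
    print(f"Under-replicated blocks: {under_replicated}")
    # Exit non-zero so a cron job or alerting tool can flag degradation.
    if used_pct > 85 or under_replicated > 0:
        sys.exit(1)


if __name__ == "__main__":
    main()

A script like this can run from cron on an edge node and feed its exit status into whatever alerting the team already uses.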
Required Qualifications
- 3 to 5 years of development experience on medium-to-large Hadoop implementations (Hortonworks preferred), supported by Java (MapReduce) and/or Python expertise.
- 3+ years of experience with SQL Server development (T-SQL), versions 2012 and later, on medium-to-large database implementations.
- Demonstrated success moving data from SQL Server to Hadoop (see the data-movement sketch after this list).
- Ability to mentor and share Hadoop and Big Data knowledge effectively to help expand expertise throughout the organization.
- Strong experience with some or all of: HDFS commands, Sqoop, Hive, Pig, Kafka, Storm, and Spark, among other Big Data and NoSQL technologies.
- Deep knowledge of Hadoop file formats (e.g., Avro, Parquet, ORC) and their applicable use cases.
- Champion the enforcement of data management and engineering best practices, with a keen focus on Data Lake organization and on managing disparate data sources across the intake, integration/aggregation, and consumption layers of the Data Lake.
- Strong commitment to preserving data lineage, quality, and integrity.
- Understanding of OLTP, OLAP/data warehouse (star schema), and mixed workloads.
- Fair competency in writing SQL queries and in relational database modeling and design.
- A solid grasp of the Git version control system.
- Ability to learn and expand use of PowerShell to manage and monitor databases.
- Some experience with Elasticsearch and its integration with SQL Server or the Hadoop ecosystem and components.
- Basic knowledge of SQL Server Integration Services (SSIS) and SQL Server Reporting Services (SSRS).
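As one illustration of the SQL Server-to-Hadoop data movement called out above, the following hypothetical PySpark sketch reads a table over JDBC and lands it in the Data Lake as Parquet. The connection details, table, and paths are invented placeholders, the Microsoft JDBC driver must be on Spark's classpath, and Sqoop is a common alternative for the same job.

# Hypothetical sketch: land a SQL Server table in the raw zone of the lake.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("sqlserver-to-datalake")
    .getOrCreate()
)

# Read the source table in parallel, split on a numeric key column.
orders = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://sqlhost:1433;databaseName=Sales")
    .option("dbtable", "dbo.Orders")
    .option("user", "etl_user")
    .option("password", "***")          # use a credential store in practice
    .option("partitionColumn", "OrderID")
    .option("lowerBound", "1")
    .option("upperBound", "10000000")
    .option("numPartitions", "8")
    .load()
)

# Write to HDFS as Parquet, partitioned by date for downstream Hive/Spark use.
(
    orders.write.mode("overwrite")
    .partitionBy("OrderDate")
    .parquet("hdfs:///data/raw/sales/orders")
)

Partitioning the read on a numeric key keeps a large extract from funneling through a single JDBC connection, and Parquet output keeps the landed data immediately queryable from Hive or Spark.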
Working at ICF
Working at ICF means applying a passion for meaningful work with intellectual rigor to help solve the leading issues of our day. Smart, compassionate, innovative, committed, ICF employees tackle unprecedented challenges to benefit people, businesses, and governments around the globe. We believe in collaboration, mutual respect, open communication, and opportunity for growth. If you’re seeking to make a difference in the world, visit www.icf.com/careers to find your next career. ICF—together for tomorrow.
Bangalore, India (II76)