Location : Portland, ME 04101
Headquarters : Portland, ME
Hiring Mode : Full Time
Hiring Role : Developer
Experience : Mid Level
Job Duties and Responsibilities:
- Build data flow channels and processing systems to extract, transform, load and integrate data from various sources using frameworks such as Apache NiFi.
- Develop complex code, scripts and data pipelines to process structured and unstructured data in near real time using Apache Spark and Apache Kafka.
- Work with stakeholders to understand needs for data structure, availability and accessibility.
- Develop prototypes and proofs of concept for the selected solutions using Apache Flink.
- Migrate existing data from the on-premises Hive data warehouse to the cloud-based Snowflake data warehouse.
- Design data models with layers such as Raw, Curated and Reporting in the Snowflake data warehouse.
- Translate, load and present unrelated data sets from various formats and sources such as JSON, text files, Kafka queues and log data.
- Install and configure Docker images for Telegraf, InfluxDB, Grafana and Kapacitor on AWS EC2 instances for cloud monitoring.
- Design and develop Kapacitor scripts that send alerts as push notifications, SMS, email and Slack messages.
- Architect, design and develop big data streaming applications that use Redis, a high-performance, highly available NoSQL key-value store, for checkpointing.
- Design and develop AWS cloud deployment scripts using AWS CloudFormation templates, Terraform and Ansible.
- Design and develop Spark applications in Scala that use DOM/SAX parsers to parse incoming raw string/XML data.
- Develop and deploy Chef scripts on centralized DEV, QA and PROD servers for installing Java 8, Apache Spark/Flink/Apex, Stunnel, Nginx HTTP server, Telegraf agent and Apache ZooKeeper on standalone AWS cloud clusters.
- Automate data collection and analysis processes, as well as data release and reporting tools.
Work experience / Technologies required for the position:
- Advanced working knowledge of SQL, including query authoring, and experience with relational databases, as well as working familiarity with a variety of databases.
- Experience building and optimizing big data pipelines, architectures and data sets.
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Strong analytic skills related to working with unstructured datasets.
- Experience building processes that support data transformation, data structures, metadata, dependency and workload management.
- A successful history of manipulating, processing and extracting value from large disconnected datasets.
- Working knowledge of message queuing, stream processing, and highly scalable big data stores.
- Strong project management and organizational skills.
- Experience supporting and working with cross-functional teams in a dynamic environment.
- At least 2 years of experience in a Data Engineer role.
- Experience with big data tools: Hadoop, Spark, Kafka, etc.
- Experience with relational SQL and NoSQL databases, including Postgres and Cassandra.
- Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
- Experience with AWS cloud services: EC2, EMR, RDS, Redshift.
- Experience with stream-processing systems: Storm, Spark Streaming, etc.
- 3+ years of experience with object-oriented/functional scripting languages: Python, Java, C++, Scala, etc.
Work location is Portland, ME, with required travel to client locations throughout the USA.
Rite Pros is an equal opportunity employer (EOE).
Please mail resumes to:
Rite Pros, Inc.
565 Congress St, Suite # 305
Portland, ME 04101.