
Urgent Hiring
Senior Data Engineer
Designation: Senior Data Engineer
Experience: 7+ years
Location: Kolkata
Summary
We are seeking a highly experienced Senior Data Engineer with a minimum of 7 years of experience to design, build, and optimize our next-generation data pipelines. In this role, you will be responsible for building robust data infrastructure that handles diverse data modalities and for establishing the ETL processes around it. The ideal candidate must have specialized, hands-on experience handling multi-modal data streams, including image/video processing, natural language text, and high-frequency time-series datasets.
Key Responsibilities
Pipeline Development: Design, implement, and maintain scalable batch and real-time data pipelines (ETL/ELT).
Multi-Modal Data Handling: Build optimized ingestion and processing architectures specifically for unstructured video, image, and text data, as well as structured time-series metrics.
Infrastructure Management: Architect data lakes, data warehouses, and feature stores to support downstream analytics and advanced Machine Learning workflows.
Performance Tuning: Optimize data workflows for high throughput, low latency, and efficient storage utilization.
Collaboration: Partner closely with Data Science, ML Engineering, and Product teams to translate complex data requirements into production-ready pipelines.
Required Qualifications & Experience
General Experience:
Overall Career: Minimum of 7 years of professional software engineering and data engineering experience.
Core Language Proficiency: Strong programming skills in Python, Scala, or Java.
Database & Querying: Expert-level mastery of advanced SQL and database design.
Mandatory Domain-Specific Experience
Image/Video Processing (3+ Years Mandatory):
Proven experience with image and video manipulation libraries (e.g., OpenCV, FFmpeg, Pillow).
Familiarity with the extraction, decoding, transformation, and structural cataloging of large-scale unstructured visual media files.
Text Processing (Mandatory):
Deep experience with unstructured text data pipelines, tokenization, embedding extraction, and NLP preprocessing workflows.
Experience with vector databases (e.g., Pinecone, Milvus, Chroma, Qdrant).
Time-Series Data Handling (Mandatory):
Strong expertise working with high-frequency time-series datasets, windowing functions, and real-time streaming aggregation.
Experience using specialized time-series databases or frameworks (e.g., InfluxDB, TimescaleDB, Apache Druid).
Big Data & Cloud Stack
Distributed Computing: Extensive experience using distributed engines like Apache Spark, Flink, or Hadoop.
Streaming Frameworks: Hands-on experience with streaming platforms like Apache Kafka or AWS Kinesis.
Cloud Platforms: Proficiency with major cloud providers (AWS, GCP, or Azure) and managed data services (e.g., Snowflake, BigQuery, Databricks).
Orchestration: Experience with workflow management tools like Apache Airflow, Prefect, or Dagster.
Preferred Qualifications
Experience building Feature Stores for machine learning models (e.g., Feast, Tecton).
Knowledge of containerization and orchestration via Docker and Kubernetes.
Familiarity with MLOps frameworks.

Invariz transforms fragmented enterprise systems into orchestrated intelligence through bespoke AI solutions that unify data, decisions, and outcomes. We help organizations evolve from ideas to enterprise capabilities with products, solutions, and dedicated centers that scale intelligence for long‑term impact.

Adventz Group is a dynamic Indian conglomerate with global ambitions, driving growth through agriculture, engineering, infrastructure, real estate, and consumer services. With over 75 years of legacy, 6,000 employees, and world‑class partnerships, it continues to transform industries and contribute to India’s prosperity.
© Invariz - 2026. All rights reserved