Senior Data Engineer - Python & PySpark
- Job Req Id: 25891701
- Location(s): Chennai, Tamil Nadu, India
- Job Type: Hybrid
- Posted: Nov. 11, 2025
Discover your future at Citi
Working at Citi is far more than just a job. A career with us means joining a team of more than 230,000 dedicated people from around the globe. At Citi, you’ll have the opportunity to grow your career, give back to your community and make a real impact.
Job Overview
- Data Architecture & Design: Design, develop, and optimize data architectures, pipelines, and data models to support various business needs, including analytics, reporting, and machine learning.
- ETL/ELT Development (Python/PySpark Focus): Build, test, and deploy highly scalable and efficient ETL/ELT processes using Python and PySpark to ingest, transform, and load data from diverse sources into data warehouses and data lakes. Develop and optimize complex data transformations using PySpark.
- Data Quality & Governance: Implement best practices for data quality, data governance, and data security to ensure the integrity, reliability, and privacy of our data assets.
- Performance Optimization: Monitor, troubleshoot, and optimize data pipeline performance, ensuring data availability and timely delivery, particularly for PySpark jobs.
- Infrastructure Management: Collaborate with DevOps and MLOps teams to manage and optimize data infrastructure, including cloud resources (AWS, Azure, GCP), databases, and data processing frameworks, ensuring efficient operation of PySpark clusters.
- Mentorship & Leadership: Provide technical guidance, mentorship, and code reviews to junior data engineers, particularly in Python and PySpark best practices, fostering a culture of excellence and continuous improvement.
- Collaboration: Work closely with data scientists, analysts, product managers, and other stakeholders to understand data requirements and deliver solutions that meet business objectives.
- Innovation: Research and evaluate new data technologies, tools, and methodologies to enhance our data capabilities and stay ahead of industry trends.
- Documentation: Create and maintain comprehensive documentation for data pipelines, data models, and data infrastructure.
- Bachelor's or Master's degree in Computer Science, Software Engineering, Data Science, or a related quantitative field.
- 5+ years of professional experience in data engineering, with a strong emphasis on building and maintaining large-scale data systems.
- Extensive hands-on experience with Python for data engineering tasks.
- Proven experience with PySpark for big data processing and transformation.
- Proven experience with cloud data platforms (e.g., AWS Redshift, S3, EMR, Glue; Azure Data Lake, Databricks, Synapse; Google BigQuery, Dataflow).
- Strong experience with SQL and NoSQL databases (e.g., PostgreSQL, MySQL, MongoDB, Cassandra).
- Extensive experience with distributed data processing frameworks, especially Apache Spark.
- Programming Languages: Expert proficiency in Python is mandatory. Strong SQL mastery is essential. Familiarity with Scala or Java is a plus.
- Big Data Technologies: In-depth knowledge and hands-on experience with Apache Spark (PySpark) for data processing, including Spark SQL, Spark Streaming, and DataFrame API. Experience with Apache Kafka, Apache Airflow, Delta Lake, or similar technologies.
- Data Warehousing: In-depth knowledge of data warehousing concepts, dimensional modeling, and ETL/ELT processes.
- Cloud Platforms: Hands-on experience with at least one major cloud provider (AWS, Azure, GCP) and their data services, particularly those supporting Spark/PySpark workloads.
- Containerization: Familiarity with Docker and Kubernetes is a plus.
- Version Control: Proficient with Git and CI/CD pipelines.
- Excellent problem-solving and analytical abilities.
- Strong communication and interpersonal skills, with the ability to explain complex technical concepts to non-technical stakeholders.
- Ability to work effectively in a fast-paced, agile environment.
- Proactive and self-motivated with a strong sense of ownership.
- Experience with real-time data streaming and processing using PySpark Structured Streaming.
- Knowledge of machine learning concepts and MLOps practices, especially integrating ML workflows with PySpark.
- Familiarity with data visualization tools (e.g., Tableau, Power BI).
- Contributions to open-source data projects.
- Job Family Group: Technology
- Job Family: Data Analytics
- Time Type: Full time
- Most Relevant Skills: Please see the requirements listed above.
- Other Relevant Skills: For complementary skills, please see above and/or contact the recruiter.
Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.
If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.
View Citi’s EEO Policy Statement and the Know Your Rights poster.