Harish Chander Ramesh, Developer in Dubai, United Arab Emirates
Harish is available for hire

Harish Chander Ramesh

Verified Expert in Engineering

Data Engineer and Developer

Location
Dubai, United Arab Emirates
Toptal Member Since
April 22, 2022

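Harish is a data engineer who has been consuming, engineering, analyzing, exploring, testing, and visualizing data for personal and professional purposes for the past ten years. His passion for data has led him to work with multiple Fortune 50 companies, including Amazon and Verizon. Harish enjoys challenges and believes he learns and performs best when he steps out of his comfort zone.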

Portfolio

MH Alshaya
Apache Airflow, Apache Spark, Google Cloud Platform (GCP), Google Analytics...
Verizon Media
Apache Airflow, Apache Spark, Python, Tableau, ELK (Elastic Stack), Datadog...
Amazon
Apache Airflow, Apache Spark, Tableau, ETL, Dashboards, Data Visualization...

Experience

Availability

Full-time

Preferred Environment

Google Cloud Platform (GCP), Tableau, Microsoft Power BI, SQL, ETL, Business Intelligence (BI), Data Visualization, Amazon Web Services (AWS), Google BigQuery, Azure SQL Databases, Data Engineering, AWS Data Pipeline Service, Data Management, Collibra, Informatica Cloud, Informatica ETL, Informatica, Oracle, JavaScript, Data Architecture, Excel 365, CSV File Processing, Excel VBA, Data Extraction, MySQL

The most amazing...

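...data platform I've built from scratch was for a video conferencing application, which had no downtime during the pandemic despite a 600% increase in usage.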

Work Experience

Data Engineer Manager

2021 - 2022
MH Alshaya
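  • Developed the first-ever data warehouse from scratch, incorporating product analytics at scale, using various GCP services.
  • Developed the Golden Customer Record in real time, extending the loyalty programs of 119 brands across 19 countries.
  • Developed and maintained a data quality framework with the help of internal business teams, using Great Expectations at scale. The framework was also used for near-real-time fraud analytics across more than 50 brands.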
  • Led a team of six data engineers, the first set of data engineers in the organization, and started up a data-driven culture within the team.
Technologies: Apache Airflow, Apache Spark, Google Cloud Platform (GCP), Google Analytics, Tableau, ETL, Dashboards, Data Visualization, Amazon EC2, Amazon RDS, Databases, Redshift, Apache Flink, Amazon S3 (AWS S3), Data Pipelines, Spark, Apache Kafka, Data Warehouse Design, Data Lake Design, Big Data Architecture, Data Warehousing, Data Lakes, Cloud Native, Data Engineering, Google BigQuery, Data Modeling, Analytics, Google Cloud, Data Analysis, Data Analytics, Data Science, Terraform, Data Governance, Azure, PostgreSQL, Cloud Platforms, Looker, Parquet, BigQuery, Database Schema Design, Data Management, Azure Synapse, Collibra, Informatica Cloud, Informatica ETL, Informatica, Ads, User Interface (UI), Excel 2016, Data Architecture, Data Quality, Great Expectations Cloud, AWS Glue, Oracle Cloud, Excel 365, Office 365, CSV File Processing, MongoDB, ETL Implementation & Design, Data Migration, Finance, Mobile Analytics, Firebase, Data Extraction, Amazon Web Services (AWS), ELT, Database Architecture, Database Performance, Database Development, AWS Lambda, Docker, Microservices, Technical Architecture, ETL Tools, Monitoring, Cloud, Databricks

Lead Data Engineer

2019 - 2021
Verizon Media
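  • Developed the first streaming analytics platform, using Apache Spark and Storm on AWS-managed services to process media statistics from a video conferencing solution.
  • Built a data pipeline that autoscaled itself and was unaffected by the COVID-19 pandemic, despite a 600% increase in daily usage as client teams moved to remote work.
  • Tested and implemented Apache Hudi in the early stages of its development, which also provided the ability to process historical data with ACID transactions.
  • Led a team of seven data engineers, including three senior engineers, two junior engineers, and one intern. Created opportunities to interact with large clients worldwide on technical solution consulting and solution architecture.
  • Migrated a live 2.2 PB legacy PostgreSQL database to Snowflake in five days, using dbt in the process. Designed, implemented, and validated the migration on the fly with the help of an error-reporting framework, with 0.3% of errors reported.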
Technologies: Apache Airflow, Apache Spark, Python, Tableau, ELK (Elastic Stack), Datadog, Kafka Streams, ETL, Dashboards, Data Visualization, Amazon EC2, Amazon RDS, Databases, Redshift, Storm, Apache Flink, Amazon S3 (AWS S3), Data Pipelines, Amazon Web Services (AWS), Spark, Big Data, Apache Kafka, Data Warehouse Design, Data Lake Design, Spark Streaming, Big Data Architecture, Data Warehousing, PySpark, Data Lakes, Cloud Native, Data Engineering, Google BigQuery, Data Modeling, Looker, Analytics, Google Cloud, Data Analysis, Snowflake, Data Analytics, Data Governance, Azure, PostgreSQL, pgAdmin, Data Build Tool (dbt), Cloud Platforms, Parquet, BigQuery, AWS Data Pipeline Service, Django, Database Schema Design, Data Management, Azure Synapse, Collibra, Informatica Cloud, Informatica ETL, Informatica, Amazon QuickSight, Ads, User Interface (UI), Excel 2016, JavaScript, Data Architecture, Data Quality, Great Expectations Cloud, AWS Glue, Oracle Cloud, Excel 365, Office 365, CSV File Processing, MongoDB, ETL Implementation & Design, Microsoft SQL Server, Data Migration, Finance, Mobile Analytics, Firebase, Data Extraction, MySQL, ELT, Database Architecture, Database Performance, Database Development, AWS Lambda, AWS CloudFormation, Docker, Technical Architecture, ETL Tools, Monitoring, Cloud, Databricks, Delta Lake

Data Engineer

2016 - 2018
Amazon
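  • Contributed to the world's largest eCommerce platform, which spans 16 marketplaces across different time zones worldwide. Was part of the retail business team, responsible for global retail business data management and pipelines.
  • Handled high-pressure environments and delivered on tight deadlines. Worked alongside the best minds in the country and the world, and initiated a data engineers' forum within the organization to exchange ideas.
  • Built real-time pipelines to stream data from different platforms into the Amazon data warehouse with a service-level agreement (SLA) of two minutes' latency, using Spark, Flink, and Tableau.
  • Created a 360-degree dashboard showing customers across different Amazon services. The dashboard was made public in a forum and was well received for making the data easy for consumers to understand.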
Technologies: Apache Airflow, Apache Spark, Tableau, ETL, Dashboards, Data Visualization, Amazon EC2, Databases, Redshift, Storm, Apache Flink, Amazon S3 (AWS S3), Data Pipelines, Amazon Web Services (AWS), Spark, Big Data, Apache Kafka, Data Warehouse Design, Data Lake Design, Spark Streaming, Big Data Architecture, Data Warehousing, PySpark, Data Lakes, Cloud Native, Data Engineering, Google BigQuery, Data Modeling, Looker, Data Analysis, Data Analytics, Cloud Platforms, BigQuery, Azure SQL Databases, AWS Data Pipeline Service, Django, Data Management, Amazon QuickSight, Ads, Oracle, Data Architecture, Data Quality, Great Expectations Cloud, AWS Glue, Excel 365, Office 365, CSV File Processing, MongoDB, ETL Implementation & Design, Amazon Elastic Container Service (Amazon ECS), Microsoft SQL Server, Data Migration, Mobile Analytics, Firebase, Data Extraction, MySQL, ELT, Hadoop, Database Performance, Database Development, AWS Lambda, AWS CloudFormation, Technical Architecture, ETL Tools, Monitoring, Cloud, Databricks, Delta Lake

Data Engineer

2013 - 2016
NTT Data
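  • Developed, tested, and deployed end-to-end real-time and batch ETL pipelines for a healthcare provider.
  • Documented every line of code and every change to existing products from a business perspective.
  • Stayed open to learning new technologies and grew into a technology-agnostic developer.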
  • Developed two major data warehouse-related projects, saving 23% of data storage cost.5% of maintenance cost.
Technologies: Abinitio, SQL, Teradata, Amazon RDS, Amazon EC2, Databases, Amazon S3 (AWS S3), Data Pipelines, Amazon Web Services (AWS), Big Data, Data Warehousing, PySpark, Data Engineering, Data Analysis, Snowflake, Microsoft Access, Cloud Platforms, BigQuery, Azure SQL Databases, AWS Data Pipeline Service, Data Management, Azure Synapse, Informatica Cloud, Informatica ETL, Informatica, Amazon QuickSight, Oracle, Excel 2016, Data Architecture, Data Quality, Oracle Cloud, Excel 365, Office 365, CSV File Processing, ETL Implementation & Design, Microsoft SQL Server, Data Migration, Data Extraction, ELT, Hadoop, Database Development, AWS Lambda, ETL Tools, Cloud, Databricks, Delta Lake

Competitive Price Monitoring System for eCommerce Business

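Developed a data framework that scrapes multiple eCommerce websites based on their hyper-competitiveness. Hyper-competitiveness is a metric for classifying competitors across different product categories, and competitor websites are scraped one to three times a day. The scraper scripts write their output to a data warehouse, where the data is then compared in real time at a product-to-product level to generate the PCI. The price competitiveness index (PCI) measures whether an eCommerce business's products are competitive compared with those of its super-important and important competitors.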
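As an illustration only, a product-level price competitiveness index could be sketched as below. The profile does not give the actual formula, so the definition used here (own price as a percentage of a tier-weighted average competitor price), the tier weights, and all names are assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class CompetitorPrice:
    competitor: str
    tier: str      # hypothetical labels: "super_important" or "important"
    price: float   # scraped price for the matching product


# Assumed weights: super-important competitors count twice as much.
TIER_WEIGHTS = {"super_important": 2.0, "important": 1.0}


def price_competitiveness_index(
    own_price: float, competitor_prices: Iterable[CompetitorPrice]
) -> Optional[float]:
    """Return own price as a percentage of the weighted competitor benchmark.

    Under this assumed definition, a value below 100 means the product is
    cheaper than the benchmark. Returns None when no usable price exists.
    """
    weighted = [
        (TIER_WEIGHTS[c.tier], c.price)
        for c in competitor_prices
        if c.tier in TIER_WEIGHTS and c.price > 0
    ]
    total_weight = sum(weight for weight, _ in weighted)
    if total_weight == 0:
        return None
    benchmark = sum(weight * price for weight, price in weighted) / total_weight
    return 100.0 * own_price / benchmark


if __name__ == "__main__":
    scraped = [
        CompetitorPrice("shop-a", "super_important", 100.0),
        CompetitorPrice("shop-b", "important", 130.0),
    ]
    print(price_competitiveness_index(99.0, scraped))
```

In a warehouse setting the same comparison would more likely be a SQL join between own and scraped prices per product; the Python version only makes the arithmetic explicit.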

Real-time Pipelines for Fraud Alerting

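This was for a video conferencing application whose meeting IDs were prone to hacking. The software system was not mature enough to identify fraud in meetings, so I built a data layer that captures and reports fraudulent meeting IDs in under three seconds. This was implemented using more of an open-source stack, starting from Kafka, MemSQL (now known as SingleStore), Storm, Python, and Looker as a BI solution.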

Driver's Incentives Framework

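A real-time computing platform that calculates delivery drivers' targets versus actual numbers, rewards them with instant bonuses, and encourages them to achieve more than the target. It was developed for a ride-hailing company where drivers' targets were not reported to them on a daily basis. A Grafana dashboard was created and embedded in the mobile application used by the drivers, so the drivers are aware of their performance, the incentives they have earned, and the targets to be achieved or already achieved.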
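The target-versus-actual logic described above might look roughly like the following sketch. The bonus rule (a fixed bonus on reaching the target, then a smaller bonus for each extra delivery), the amounts, and every name are hypothetical; the profile does not describe the real rules.

```python
from dataclasses import dataclass

# Assumed bonus amounts, for illustration only.
TARGET_BONUS = 10.0   # paid once, when the target is reached
EXTRA_BONUS = 1.5     # paid for each delivery beyond the target


@dataclass
class DriverProgress:
    driver_id: str
    target: int
    completed: int = 0
    bonus_earned: float = 0.0


def record_delivery(progress: DriverProgress) -> float:
    """Update progress for one completed delivery and return the instant bonus."""
    progress.completed += 1
    if progress.completed == progress.target:
        bonus = TARGET_BONUS
    elif progress.completed > progress.target:
        bonus = EXTRA_BONUS
    else:
        bonus = 0.0
    progress.bonus_earned += bonus
    return bonus


def remaining_to_target(progress: DriverProgress) -> int:
    """Deliveries still needed to reach the target (never negative)."""
    return max(progress.target - progress.completed, 0)


if __name__ == "__main__":
    driver = DriverProgress(driver_id="driver-1", target=2)
    for _ in range(3):
        print(record_delivery(driver), remaining_to_target(driver))
```

In a streaming setup, `record_delivery` would be called once per delivery event, and the resulting fields are the kind of values a driver-facing dashboard could display.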

Languages

SQL, Python, Snowflake, JavaScript, Excel VBA

Frameworks

Apache Spark, Spark, Storm, Hadoop, Django

Tools

Apache Airflow, Tableau, Microsoft Power BI, Abinitio, Kafka Streams, BigQuery, Collibra, Informatica ETL, Excel 2016, AWS Glue, ELK (Elastic Stack), Microsoft Access, pgAdmin, Amazon QuickSight, Amazon Elastic Container Service (Amazon ECS), Amazon CloudFront CDN, AWS CloudFormation, Google Analytics, Apache Storm, Logstash, Grafana, Terraform, Looker

Paradigms

ETL, Business Intelligence (BI), ETL Implementation & Design, Database Development, Data Science, Microservices

Platforms

Google Cloud Platform (GCP), Amazon EC2, Amazon Web Services (AWS), Firebase, AWS Lambda, Databricks, Apache Flink, Azure, Oracle, Docker, Apache Kafka, Cloud Native

Storage

Teradata, Redshift, Databases, Amazon S3 (AWS S3), Data Pipelines, Data Lake Design, PostgreSQL, Azure SQL Databases, AWS Data Pipeline Service, MongoDB, Microsoft SQL Server, Database Architecture, Database Performance, Datadog, Data Lakes, Google Cloud, Oracle Cloud, MySQL, MemSQL, Elasticsearch

Other

Software, Dashboards, Data Visualization, Amazon RDS, Big Data, Data Warehouse Design, Data Warehousing, Data Engineering, Google BigQuery, Data Analysis, Cloud Platforms, Data Management, Informatica Cloud, Informatica, Data Architecture, Excel 365, Office 365, CSV File Processing, Data Migration, Data Extraction, ELT, Technical Architecture, ETL Tools, Cloud, Delta Lake, Big Data Architecture, Data Modeling, Analytics, Data Analytics, Data Governance, Parquet, Database Schema Design, Fivetran, Airbyte, Azure Synapse, TIBCO, Ads, Data Quality, Finance, Mobile Analytics, Monitoring, Data Build Tool (dbt), User Interface (UI), Great Expectations Cloud

Libraries/APIs

PySpark, Spark Streaming

2009 - 2013

Bachelor of Engineering Degree in Electronics

Anna University - Chennai, India

JANUARY 2023 - PRESENT

Google Cloud Certified - Professional Data Engineer

Google Cloud
