Data Engineer (Remote)

Data Engineer | ISITCA PRIVATE LIMITED | India

Job Description: Web Scraping Engineer

Position Overview:

We are seeking a highly skilled and experienced Web Scraping Engineer to join our dynamic team. The ideal candidate will have a strong technical background in developing and maintaining web scraping scripts and workflows. This role involves collaborating with data science and engineering teams to extract, process, and validate data from various websites efficiently and at scale.

Key Responsibilities:

● Web Scraping Development: Design, develop, and maintain advanced web scraping scripts and crawlers to extract structured and unstructured data from a variety of websites.
● Collaboration: Partner with data science and engineering teams to identify data sources and define requirements for web scraping initiatives.
● Automation: Write clean, efficient, and scalable code to automate data extraction and processing tasks.
● Data Pipelines: Design and implement data pipelines, managing workflows and dependencies using Apache Airflow or similar tools.
● Workers, Queues, and Caching: Develop solutions utilizing tools like Celery and Redis for efficient task scheduling, queue management, and caching.
● ML Ops & DevOps Integration: Deploy and monitor web scraping pipelines in production environments, adhering to best practices in ML Ops and DevOps.
● Dynamic Web Handling: Solve challenges such as dynamic content, cookies, sessions, and CAPTCHAs associated with modern web scraping.
● Error Handling & Data Validation: Implement robust error-handling mechanisms and data validation techniques to ensure high-quality and accurate data extraction.
● Data Analysis: Perform data analysis and transformation using tools like Pandas, ensuring extracted data is ready for downstream processing.
● Web Automation: Utilize Selenium or similar tools to interact with dynamic web elements during scraping and automation tasks.
● Industry Knowledge: Stay current with emerging technologies, trends, and best practices in the field of web scraping, automation, and data extraction.

Required Qualifications:

● Bachelor’s degree in Computer Science, Information Technology, or a related field.
● Proven work experience in a similar role, with a minimum of 5 years specializing in web scraping.
● Proficiency in Python and experience with web scraping libraries such as Scrapy, Beautiful Soup, Selenium, or Puppeteer.
● Familiarity with common data formats (JSON, XML, CSV) and expertise in using Pandas for data manipulation.
● Hands-on experience with Apache Airflow for building and managing data pipelines, including a strong understanding of DAG concepts.
● Knowledge of ML Ops and DevOps practices to deploy and manage pipelines in production.
● Excellent problem-solving and debugging skills, with attention to detail and accuracy.
● Strong communication and collaboration skills with the ability to work effectively under strict deadlines.

Preferred Skills:

● Experience in handling dynamic web pages, managing cookies, and working around CAPTCHAs.
● Experience with task queues and caching mechanisms (Celery, Redis).
● Experience in working with cloud services for deploying scalable scraping architectures.

Compensation and Benefits:

● Salary: ₹40,000.00 – ₹50,000.00 per month (commensurate with experience)
● Benefits:
○ Comprehensive health, dental, and vision insurance.
○ Retirement savings plan with employer matching.
○ Paid time off and holidays.
○ Flexible work schedule with the option for day, night, or weekend shifts.
○ Opportunities for professional development and career growth.

Contract Details:

● Job Type: Full-time, Temporary, Freelance, Volunteer
● Contract Length: 4 months
● Work Location: Remote
● Schedule: Day shift, Night shift, Weekend availability
● Supplemental Pay: Commission pay available

Experience Requirements:

● Minimum 4 years of experience in Python.
● Minimum 5 years of experience in web scraping.
● At least 3 years of experience with queue/caching systems (Celery/Redis), Apache Airflow, or similar tools.

