Crawler PNG and SVG Icon
AWS Glue Crawler is a component that automatically scans data sources, infers schemas, and creates metadata tables in the AWS Glue Data Catalog.
Last Modified: August 29, 2025
Available sizes: 16px, 32px, 48px, 64px
Details
Key Features
- Automatically scans data sources to detect schema and metadata.
- Populates AWS Glue Data Catalog entries.
- Supports incremental crawls that scan only newly added Amazon S3 folders instead of the full dataset.
- Integrates with Amazon S3, Amazon RDS, Amazon Redshift, and other JDBC-accessible sources.
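As a rough illustration of the features above, the following Python sketch assembles the parameters for a crawler over an S3 path, optionally as an incremental crawl. The helper `build_crawler_request` and the crawler name, role ARN, database, and bucket path are hypothetical placeholders. The dictionary keys follow the boto3 Glue `create_crawler` request shape as best understood here, so check them against the current API reference before relying on them.

```python
def build_crawler_request(name, role_arn, database, s3_path, incremental=False):
    """Assemble keyword arguments for a Glue create_crawler call (hypothetical helper)."""
    request = {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }
    if incremental:
        # Incremental crawls only visit folders added since the last run.
        request["RecrawlPolicy"] = {"RecrawlBehavior": "CRAWL_NEW_FOLDERS_ONLY"}
        # Assumption: this mode expects schema changes to be logged rather
        # than applied, so both behaviors are set to LOG.
        request["SchemaChangePolicy"] = {
            "UpdateBehavior": "LOG",
            "DeleteBehavior": "LOG",
        }
    return request


# All values below are placeholders, not real resources.
request = build_crawler_request(
    name="example-sales-crawler",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueRole",
    database="example_db",
    s3_path="s3://example-bucket/sales/",
    incremental=True,
)
print(request)

# With boto3 installed and AWS credentials configured, the request could be
# submitted roughly like this:
#   import boto3
#   glue = boto3.client("glue")
#   glue.create_crawler(**request)
#   glue.start_crawler(Name=request["Name"])
```

Building the request as a plain dictionary keeps the sketch runnable without AWS access; the commented lines show where the actual API calls would go.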
Common Use Cases
- Automatically discovering and cataloging new datasets in Amazon S3.
- Keeping AWS Glue Data Catalog tables in sync with schema changes in the source data.
- Classifying data by file type and structure for ETL jobs.
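To make the idea of schema inference concrete, here is a toy Python sketch that guesses column types from a small CSV sample. It is not the logic AWS Glue classifiers use; the function names and the three type labels (`bigint`, `double`, `string`) are chosen for illustration only.

```python
import csv
import io


def infer_type(values):
    """Return the narrowest type label that fits every non-empty value."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "string"
    # Try integer first, then floating point; fall back to string.
    for type_name, caster in (("bigint", int), ("double", float)):
        try:
            for v in non_empty:
                caster(v)
            return type_name
        except ValueError:
            continue
    return "string"


def infer_csv_schema(text):
    """Map each header name to an inferred type label (illustrative only)."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    # Transpose rows into columns; with no data rows every column is empty.
    columns = list(zip(*data)) if data else [()] * len(header)
    return {name: infer_type(col) for name, col in zip(header, columns)}


sample = "id,price,city\n1,9.99,Oslo\n2,12,Lima\n"
schema = infer_csv_schema(sample)
print(schema)  # {'id': 'bigint', 'price': 'double', 'city': 'string'}
```

A real crawler samples many files, handles formats beyond CSV, and records partitions, but the core step is the same: read sample records and pick a type that fits every observed value in a column.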
Explore More Icons
Textract
Amazon Textract is an AI service that automatically extracts text, tables, and other data from scanned documents and PDFs.
Managed Blockchain
Amazon Managed Blockchain is a fully managed service that makes it easy to create and manage scalable blockchain networks using popular open-source frameworks like Hyperledger Fabric and Ethereum.
FSx
Amazon FSx provides fully managed file systems built on widely used file system technologies: Windows File Server, Lustre, NetApp ONTAP, and OpenZFS.
DataSync
AWS DataSync is an online data transfer service that simplifies, automates, and accelerates moving data between on-premises storage and AWS.
Data Lake
A data lake on AWS is a centralized, scalable, and secure repository for storing and analyzing structured and unstructured data.
Cognito
Amazon Cognito provides user authentication, authorization, and user management for web and mobile apps, with social and enterprise identity federation support.
Elemental Link
AWS Elemental Link is a hardware device that connects a live video source to AWS Elemental MediaLive for high-quality, low-latency cloud-based encoding.
Entity Resolution
AWS Entity Resolution is a machine learning-powered service that helps match, link, and deduplicate records across datasets for accurate data consolidation.
Compute Auto Scaling
AWS Auto Scaling automatically adjusts the capacity of your AWS resources to maintain steady, predictable performance at the lowest possible cost.
IoT Core
AWS IoT Core allows connected devices to securely interact with cloud applications and other devices, enabling scalable IoT solutions.
Deep Learning AMIs
AWS Deep Learning AMIs are pre-configured Amazon Machine Images optimized for ML frameworks such as TensorFlow, PyTorch, and MXNet.
Signer
AWS Signer is a fully managed code-signing service to help ensure the integrity and trustworthiness of your code by digitally signing it before deployment.
QuickSight
Amazon QuickSight is a cloud-powered business intelligence (BI) service that enables you to visualize and share insights from your data with interactive dashboards.
Elastic Transcoder
Amazon Elastic Transcoder is a media transcoding service in the cloud designed to convert media files into formats required by playback devices.
Fraud Detector
Amazon Fraud Detector is a service that uses machine learning to identify potentially fraudulent online activities in real time.
Ground Station
AWS Ground Station is a fully managed service that lets you control satellite communications, process data, and scale operations without building ground infrastructure.
Control Tower
AWS Control Tower provides a guided setup to create a secure, multi-account AWS environment based on AWS best practices.
Local Zones
AWS Local Zones bring compute, storage, and other services closer to large population centers to support latency-sensitive applications.
Application Discovery Service
AWS Application Discovery Service helps you plan migration projects by collecting usage and configuration data from your on-premises servers.
EKS Distro
Amazon EKS Distro (EKS-D) is the open-source distribution of the same Kubernetes components used by Amazon EKS, enabling consistent cluster operations on any infrastructure.
HealthLake
Amazon HealthLake is a HIPAA-eligible service that stores, transforms, and analyzes health data in the FHIR format for advanced analytics and ML.
Audit Manager
AWS Audit Manager helps you continuously audit your AWS usage to simplify risk assessment and compliance with regulations and industry standards.
Lookout for Metrics
Amazon Lookout for Metrics automatically detects and diagnoses anomalies in business and operational data using ML models.
NAT Gateway
A NAT Gateway enables instances in a private subnet to connect to the internet while preventing unsolicited inbound traffic.