SchemaCrawler Metadata Retrieval

SchemaCrawler gives you a normalized, vendor-agnostic view of your database’s schema. It starts with JDBC DatabaseMetaData and augments it with vendor catalogs and targeted SQL in database-specific modules where JDBC falls short. The goal is simple: when you point SchemaCrawler at different databases, you still get the same kinds of objects with the same kinds of details, so your documentation, diagrams, and impact analysis don’t break when the backend changes.

At the top level, SchemaCrawler understands servers, catalogs, and schemas. It uses those to scope everything else and preserve fully qualified names. Inside those namespaces, it discovers tables and views (including materialized views where the platform supports them) and treats them as first-class objects. Views are not just names; they carry dependencies on the underlying tables and other views, so you can see what breaks if a base table changes.

Columns are richly described. You get native and JDBC data types, nullability, default expressions, and details for identity and generated or computed columns. Hidden columns are shown if the database supports them. If the platform exposes collation or character set at the column level, SchemaCrawler captures that as well. For identity/auto-increment behavior, it keeps the metadata on the column and, where the engine supports standalone sequences and makes the link visible, it associates columns with their backing sequences.

Keys and constraints are modeled explicitly. Primary keys and unique constraints show up with their column order and names. Foreign keys include referenced tables, participating columns, and, when available, referential actions (on update/delete) and deferrability. Check constraints are included as expressions so your business rules are documented close to the data. If a database differentiates “validated” versus “not validated” or similar states and exposes that in the catalogs, SchemaCrawler records that nuance.

Indexes go beyond “name and columns.” You will see uniqueness, column order, and, on databases that provide the details, expressions for function-based indexes, predicates for partial or filtered indexes, and included (non-key) columns. These details matter for performance reviews and for explaining why the optimizer picks a particular plan, and SchemaCrawler keeps them in the model whenever the engine and driver make them available.

Routines — both procedures and functions — are first-class too. SchemaCrawler captures names, overloads, parameters with directions (IN/ OUT/ INOUT), and return types, including table-returning functions when the database supports them. On engines with routine containers like packages, routines are grouped accordingly so you don’t lose that organization. The tool doesn’t just list routines; it also tries to connect them to the tables they reference. Routine-to-table dependency discovery depends on the database exposing dependency views or routine bodies, so coverage varies, but when the information is there SchemaCrawler adds those edges to your dependency graph.

Triggers are recorded with their timing (before/ after/ instead of) and the events they respond to (insert/ update/ delete). Where possible, SchemaCrawler differentiates per-row versus per-statement triggers. Triggers are also tied back to the tables they act on, which helps with impact analysis — especially in systems that encode auditing or business logic in triggers rather than in application code.

Sequences are modeled as independent objects with their attributes (increment, min/ max, cache), and they’re linked to the columns they feed when the database conveys that relationship. This, alongside identity metadata on columns, gives a complete picture of key generation strategies in your schema.

Many real-world schemas rely on indirection, so SchemaCrawler supports synonyms and aliases. It resolves synonyms to their base objects when permissions allow, but it keeps both the synonym and its target visible so you understand the indirection layer without losing traceability. This is particularly useful when documenting shared schemas or legacy systems that route through synonyms.

Domain-specific type systems are also represented. Where the database supports user-defined types, domains, enums, or composite or row types, SchemaCrawler surfaces them and shows where they are used. This helps you see the semantic layer beyond raw SQL types and understand where constraints are centralized in reusable types rather than repeated on individual columns.

Security and documentation metadata are included out of the box. Object-level and column-level privileges and grants are read from the catalogs, along with ownership. Comments and descriptions are pulled in wherever the platform stores them — either via standard comment fields or engine-specific extended properties — so your generated docs can reflect the human-authored knowledge that usually lives in the database.

Two more pieces round out the picture: text and storage. Collation and character set details appear where the engine exposes them, which matters for sorting, comparisons, and cross-locale applications. Storage-related attributes such as tablespaces or similar placement hints are modeled so you can see enough to reason about where data lives and how it’s grouped, when the database provides that information.

Finally, SchemaCrawler doesn’t stop at “hard” relationships. It will infer weak associations between tables when conventional foreign keys are missing — typically by applying naming conventions like matching “foo_id” to table “foo”. These inferred links are clearly marked as such so you can distinguish between enforced constraints and best-guess relationships, which is invaluable when documenting or modernizing legacy schemas that grew without strict referential integrity.

All of this is controllable. Information levels let you decide how deep to go — basic objects only for a quick overview, or the full graph including triggers, routines, synonyms, and sequences when you need comprehensive docs and impact analysis. Inclusion and exclusion rules help you focus on the schemas and object types that matter to you. And, as with any metadata tooling, the fidelity depends on database version, JDBC driver, and the privileges of the user running SchemaCrawler; the more the database exposes, the more SchemaCrawler faithfully reflects.