How to Extract Metadata from PACER Documents
- About the PACER Metadata Extractor Component
- How to Add the PACER Metadata Extractor to AutoClassifier
- How to Configure the PACER Metadata Extractor Component
About the PACER Metadata Extractor Component
The PACER Metadata Extractor pipeline stage extracts legal metadata, such as the case name, case ID, court, and publish date, from PACER documents.
How to Add the PACER Metadata Extractor to AutoClassifier
Prerequisites: This component requires the Tika Extractor component.
Use the following steps to add the Tika Extractor component and the PACER Metadata Extractor component to an AutoClassifier pipeline.
- Navigate to the AutoClassifier Pipelines component page.
- Click New Component and select Tika Extractor from the component list.
- Name your new Tika Extractor component and click Add.
- After adding your Tika Extractor, click New Component and select PACER Metadata Extractor from the component list.
- Name your new PACER Metadata Extractor component and click Add.
- Click Apply to save your changes.
- Confirm that your new Tika Extractor and PACER Metadata Extractor components appear in the list of existing pipeline stages.
How to Configure the PACER Metadata Extractor Component
Prerequisites: The Tika Extractor Component must be configured and the Extract Body and Extract Metadata fields must be enabled.
To configure your PACER Metadata Extractor component, select it from the existing component list and complete the following fields in the Configuration section:
- Enable Court Listener Mappings: Controls whether the output includes the court ID and court name equivalents used by the CourtListener API. If this setting is enabled, PacerCourtId, PacerCourtName, CourtListenerCourtId, and CourtListenerCourtName are returned instead of CourtName.
- Enable Extracting Judge Name: Controls whether the output includes the judge name. If this setting is enabled, the Judge property is returned.
- Extract Judge Names Regex Pattern: The regular expression that matches the words immediately preceding the judge name. For example: "hon\.|honorable|district judge".
- Click Apply then Cancel.
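To try out a candidate value for Extract Judge Names Regex Pattern before saving it, a small script can help. The following Python sketch is illustrative only. The prefix pattern comes from the example above, but the rule for what counts as a name (up to four capitalized words or initials after the prefix), the case-insensitive prefix matching, and the helper name find_judge are all assumptions. The component's actual matching logic may differ.

```python
import re

# Prefix pattern copied from the example in the configuration field.
PREFIX_PATTERN = r"hon\.|honorable|district judge"

# Assumption: a name token is either an initial ("Q.") or a capitalized word.
NAME_TOKEN = r"(?:[A-Z]\.|[A-Z][\w'-]+)"
# Assumption: a judge name is one to four such tokens separated by whitespace.
NAME_PATTERN = rf"(?P<name>{NAME_TOKEN}(?:\s+{NAME_TOKEN}){{0,3}})"


def find_judge(text, prefix_pattern=PREFIX_PATTERN):
    """Return the first name-like phrase that follows a prefix match, or None."""
    # The prefix is matched case-insensitively; the name tokens are not,
    # so that lowercase words after the name end the match.
    regex = re.compile(rf"(?i:{prefix_pattern})\s+{NAME_PATTERN}")
    match = regex.search(text)
    return match.group("name") if match else None


if __name__ == "__main__":
    print(find_judge("Before the Honorable Jane Q. Smith, presiding"))
    print(find_judge("Hon. John Doe"))
    print(find_judge("No judge is named here"))
```

Running a few sample headers from your own documents through a helper like this can show whether the prefix alternatives are too narrow (no match) or too broad (matching unrelated text) before the pattern is applied in the pipeline.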
Output Properties
| Property | Type |
| --- | --- |
| Type | Text |
| PublishDate | Text |
| DocumentDisplayNumber | Text |
| CourtName | Text |
| Cost | Text |
| CaseName | Text |
| CaseId | Text |
| PacerCourtId* | Text |
| PacerCourtName* | Text |
| CourtListenerCourtId* | Text |
| CourtListenerCourtName* | Text |
| Judge** | Text |

*The PacerCourtId, PacerCourtName, CourtListenerCourtId, and CourtListenerCourtName metadata properties are returned only if Enable Court Listener Mappings is set to True. In that case, CourtName is not returned.
**The Judge metadata property is returned only if Enable Extracting Judge Name is set to True.
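The conditional rules in the notes above can be summarized in a short Python sketch. It only models which property names to expect for a given combination of settings. The function name expected_properties and its parameters are hypothetical and are not part of the component.

```python
# Properties listed above without a footnote marker, other than CourtName.
BASE_PROPERTIES = [
    "Type",
    "PublishDate",
    "DocumentDisplayNumber",
    "Cost",
    "CaseName",
    "CaseId",
]

# Properties returned in place of CourtName when the mappings are enabled.
COURT_LISTENER_PROPERTIES = [
    "PacerCourtId",
    "PacerCourtName",
    "CourtListenerCourtId",
    "CourtListenerCourtName",
]


def expected_properties(enable_court_listener_mappings, enable_extracting_judge_name):
    """Return the property names expected for the given settings."""
    props = list(BASE_PROPERTIES)
    if enable_court_listener_mappings:
        # The four mapped court properties replace CourtName.
        props.extend(COURT_LISTENER_PROPERTIES)
    else:
        props.append("CourtName")
    if enable_extracting_judge_name:
        props.append("Judge")
    return props


if __name__ == "__main__":
    print(expected_properties(False, False))
    print(expected_properties(True, True))
```

A check like this can be useful when writing downstream code that reads the extracted metadata, since CourtName and the four mapped court properties are never returned together.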