Automated PDF projects
Before you begin
How it works
Blackout can scan a large number of PDFs for sensitive information based on a provided set of words, terms, phrases, or regular expressions. Users select a saved search and a markup set, then input or upload rules; Blackout then scans all of the searchable text content in each PDF.
Any area of a PDF that is composed of text-searchable blocks or properties is scanned and then redacted or highlighted based on the configured rules. This includes page text, headers, footers, text-based annotations, form data, field properties, and more.
What to consider
- Blackout can only automatically mark up text-searchable blocks in a PDF. It will not scan images for text in automated projects.
- Images and multimedia content can be manually redacted during QC. To learn more, see the manual markup documentation.
- Headers, footers, and form data are part of a PDF's text body and do not require additional options to be scanned.
Creating a PDF project
Getting started
- Separate out the PDFs to be redacted into their own saved search
- Ensure that the required markup set for the case has been created in Relativity
- Validate that you have access to the Blackout tab for the case
- There is a wide range of workflows that can be utilized to accomplish different markup goals. However, we will focus on the basic tooling and how to create and run the project for this guide.
- Rules are not required to create a project
PDF project setup
- Navigate to Blackout | Projects
- Click Create New Project
- Select PDF
- Fill out the project form using the details below
- Click Create
- A form will be displayed that is used to define how the project will run
The project create form
Field | Description |
---|---|
Project Name | An identifier for the project, making it easy to refer to or for others to find. |
Saved Search | The document source. This should comprise the documents that will be scanned for sensitive information. |
Markup Set | The markup set that markups created by the project will be associated with. |
Redact All Attachments | Checking this option removes all attachments from each document in the set. |
Rules | Rules tell a project which information should be marked up and how to place markups on matched content. There are two kinds of rules: redaction and highlight. Rules are organized into groups, where each group contains a set of instructions for the specified terms, phrases, or regular expressions. A project can comprise multiple groups, each with different instructions. Each time the fields for a rule group are filled out, another section appears below it. Rules are not required to create the project, which is especially useful if uploading rules via CSV. |
Redact | Redaction rules remove the underlying content from the document and place a black, white, or text redaction where the removed content was. |
Highlight | Highlight rules create a rectangle in the foreground of the matched content with the selected color. |
Markup Reason | A simple message that can be associated with the markups made by the rule group. It can be reviewed using the Blackout reports. |
Markup Scope | Defines the markup behavior when the project matches a rule. |
Markup SubType | The style of markup to place. For redactions, the available options are black, white, or text markup. For highlights, the options are yellow, blue, green, orange, pink, and purple. |
Word/Phrase | The words, phrases, and text that will be marked up across the document set. |
Regex | Regular expressions can be used to identify important patterns like email addresses, social security numbers, credit card numbers, and any other content that may appear in a regular pattern throughout the document set. Regular expressions require a name and the expression to be valid. After saving the project, these regular expressions will be available to be selected by name on other projects within the same case. |
dtSearch | dtSearch includes special characters and other operators that you can use to define search criteria. |
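Before adding a regular expression to a rule group, it can help to try it against sample text. The sketch below is illustrative only: the two patterns and the `find_matches` helper are hypothetical, not Blackout defaults, and it uses Python's `re` module, whose syntax may differ from the regex engine Blackout uses.

```python
import re

# Hypothetical named patterns for illustration; they are not shipped with Blackout.
PATTERNS = {
    "US SSN": r"\b\d{3}-\d{2}-\d{4}\b",
    "Email address": r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b",
}


def find_matches(text, patterns):
    """Return (name, matched_text, start, end) for every pattern match in text."""
    results = []
    for name, expression in patterns.items():
        compiled = re.compile(expression)  # raises re.error if the expression is invalid
        for match in compiled.finditer(text):
            results.append((name, match.group(0), match.start(), match.end()))
    return results


sample = "Contact jane.doe@example.com regarding SSN 123-45-6789."
matches = find_matches(sample, PATTERNS)
for name, value, start, end in matches:
    print(f"{name}: {value!r} at characters {start}-{end}")
```

Testing an expression on representative text first helps catch patterns that are too broad or too narrow before they are applied across a whole document set.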
Uploading rules from CSV
After the PDF project has been created, rules can be added by uploading a completed CSV template.
What to know
- Only a valid project name, saved search, and markup set are required to create a project before uploading rules via CSV
- Blackout can support up to 100,000 rules via CSV.
- More rules mean more processing time, so take this into account when preparing the project
- Rules loaded through CSV behave exactly as rules input on the create project screen
- Rules loaded through CSV do not appear in the rules list due to the number of rules supported
- To review CSV rules, the CSV that is loaded into the project can be downloaded and reviewed
The CSV template and explanation of the columns can be viewed and downloaded from the Milyli Support Center.
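Because a project supports up to 100,000 CSV rules, counting rows before uploading can catch an oversized file early. The following sketch rests on assumptions: it treats the first row as a header and every later non-empty row as one rule, which may not match the actual template, so consult the template from the Milyli Support Center for the real column layout. The `count_rule_rows` and `within_rule_limit` helpers and the sample column names are hypothetical.

```python
import csv
import io

MAX_CSV_RULES = 100_000  # limit stated in this guide


def count_rule_rows(csv_file):
    """Count non-empty data rows, assuming the first row is a header (an assumption)."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the assumed header row
    return sum(1 for row in reader if any(cell.strip() for cell in row))


def within_rule_limit(csv_file, limit=MAX_CSV_RULES):
    """Return (ok, count), where ok is True when the row count does not exceed limit."""
    count = count_rule_rows(csv_file)
    return count <= limit, count


# In-memory sample with made-up column names; for a real file, pass
# open("rules.csv", newline="") instead.
sample_csv = io.StringIO("Term,Reason\nJane Doe,PII\n\n123-45-6789,PII\n")
ok, count = within_rule_limit(sample_csv)
print(f"{count} rules found; within limit: {ok}")
```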
How to do it
- From the project view screen, click the Upload rules CSV button.
- Navigate to the completed CSV.
- Click the upload icon.
- If the CSV file is valid, the rules will be created and the button will display how many rules have been uploaded.
Running a PDF project
What to know
- When a project is in a valid state to run, with a name, saved search, markup set, and rules, the status message on the project view screen will display that it is ready to run.
- PDF projects share the first-come, first-served queue with other Blackout projects
- After clicking the run button, the project will be queued immediately
- If there are other projects in queue, they will be completed before the project starts running
- Once the project is next, it will begin processing
- While the project is running, PDFs will be distributed to high resource workers across all available Blackout agents that have high resource enabled
- Similar to Excel projects, the PDF project will keep track of any document that it cannot successfully redact and will provide a report after it completes the project
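The readiness check and queueing behavior described above can be pictured with a small model. This is only a conceptual sketch, not Blackout's implementation: the `Project` class, its fields, and `ProjectQueue` are hypothetical names chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Project:
    """Hypothetical stand-in for a Blackout project; not the product's data model."""
    name: str = ""
    saved_search: str = ""
    markup_set: str = ""
    rules: list = field(default_factory=list)

    def is_ready_to_run(self):
        # Ready only when a name, saved search, markup set, and at least one rule exist.
        return bool(self.name and self.saved_search and self.markup_set and self.rules)


class ProjectQueue:
    """First-come, first-served queue: projects run in the order they were queued."""

    def __init__(self):
        self._pending = deque()

    def enqueue(self, project):
        if not project.is_ready_to_run():
            raise ValueError(f"Project {project.name!r} is not ready to run")
        self._pending.append(project)

    def run_all(self):
        """Process queued projects oldest first and return their names in run order."""
        run_order = []
        while self._pending:
            run_order.append(self._pending.popleft().name)
        return run_order


queue = ProjectQueue()
queue.enqueue(Project("Excel batch", "Search A", "Set 1", ["SSN"]))
queue.enqueue(Project("PDF batch", "Search B", "Set 1", ["Email"]))
run_order = queue.run_all()
print(run_order)
```

In this model, a PDF project queued behind another project waits until the earlier one has finished, which mirrors the shared queue described in the list above.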
While the project is running
While the project is running, information about the project will be displayed.
Message | Description |
---|---|
Documents prepared | The number of documents that have been successfully added to the system for markup review |
Documents completed | The number of documents that have been marked up successfully |
Progress bar | Displays the number of actions completed for preparation and review. |
Time elapsed | The total time the project has been processing |
Current activity | The current activities that the Blackout agent is performing, which include preparation, reviewing, and marking up. |
Stop | Begins the stop operation, which will cancel all unfinished work for the project. |
After the project completes, a results page displays information about the completed work and provides a launchpad for further quality review.
Project results
When the project completes, information about the most recent and previous project runs will be available. The following table provides an overview of the different data.
Message | Description |
---|---|
Documents marked up | The total number of documents that have had markups placed on them by Blackout using the rules from this run. |
Markups placed | The total number of markups, both redaction and highlight, that have been placed by Blackout using the rules from this run. |
Documents with warnings | The number of documents that encountered a warning that did not stop the project. |
History | |
Reverting a PDF project
What to know
- Occasionally the markups created by a PDF project may need to be reverted.
- This need may arise when rules need to be modified or a case settles.
Reverting the project
- From the project view screen, click the Revert button
- A dialogue will appear; confirm reverting the project
- The project will be queued
- The Blackout agent will pick it up and begin reverting the documents marked up by the project
- A progress bar will display how many documents have been reverted