
How to Automate Data Entry Into Green Screens From PDFs and Scanned Documents

Every day, thousands of organizations receive documents—invoices, purchase orders, work orders, inspection reports—that need to be manually typed into IBM i green screens. It's slow, error-prone, and entirely dependent on operators who know the system. Here's how to automate data entry into green screens from PDFs using AI that understands both documents and terminals.

The Document-to-Terminal Workflow

Before exploring automation, it's worth understanding exactly what happens today in most IBM i environments when a document arrives. The workflow is remarkably consistent across industries, whether you're processing invoices in accounts payable, entering purchase orders in procurement, or logging inspection results in quality management.

A document arrives—as an email attachment, a scanned PDF, a fax, or sometimes still on paper. An operator opens the document on one screen and the IBM i terminal emulator on another. They read a field from the document, switch to the terminal, navigate to the correct screen, type the value, and press Enter or Tab. They repeat this for every field on every screen that the transaction requires. For a typical invoice, this might mean 15–30 fields across 3–5 screens. For a complex work order, it could be 50+ fields across a dozen screens.

Multiply this by hundreds or thousands of documents per day, and you have a process that consumes entire teams of operators, each performing the same mechanical task: read from paper, type into terminal, advance to the next screen, repeat. This is the workflow that AS/400 document processing automation aims to eliminate.

Traditional Approaches and Their Limits

The desire to automate this workflow isn't new. Organizations have tried several approaches over the years, each with significant limitations.

OCR Template Matching

Traditional OCR systems work by defining zones on a document template: “the invoice number is always at coordinates X,Y with dimensions W,H.” This works well for a single document format from a single vendor. It breaks immediately when a vendor changes their invoice layout, when you onboard a new supplier, or when documents arrive in slightly different orientations. Maintaining hundreds of OCR templates for hundreds of vendors becomes a full-time job in itself.
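
To make the fragility concrete, here is a minimal Python sketch of zone-based extraction. It works on lines of already-recognized text rather than pixel coordinates, and the template, field name, and sample pages are all hypothetical.

```python
# Hypothetical zone template: field name -> (line index, start column, width).
# Real OCR templates use pixel or point coordinates; text positions keep the sketch short.
TEMPLATE = {"invoice_number": (0, 12, 8)}


def extract_by_zone(page_lines, template):
    """Cut each field out of a fixed position, the way a zone template does."""
    fields = {}
    for name, (line, col, width) in template.items():
        text = page_lines[line] if line < len(page_lines) else ""
        fields[name] = text[col:col + width].strip()
    return fields


# The layout the template was built for.
original_layout = ["Invoice No: INV-1001"]
# The same vendor after adding a company name above the invoice number.
changed_layout = ["ACME Corp", "Invoice No: INV-1001"]

print(extract_by_zone(original_layout, TEMPLATE))
print(extract_by_zone(changed_layout, TEMPLATE))
```

The first call returns the invoice number. The second returns an empty string, because the value moved down one line and the template has no way to know.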

Custom Integration Scripts

Some organizations have built custom scripts—often in RPG, Python, or a middleware tool—that parse specific document formats and feed data into IBM i via data queues, SQL inserts, or API calls. These work reliably for the specific scenarios they were built for, but they're rigid and expensive to maintain. Every new document type needs new development. Every screen change needs script updates. The maintenance burden grows linearly with the number of automated workflows.

RPA Bots

RPA-based automation of IBM i terminal emulators adds a Windows desktop agent that drives the emulator and reads documents. As we've discussed in our comparison of RPA and AI approaches, this creates a fragile chain of dependencies: a desktop machine, an emulator, and a bot that relies on screen coordinates and pixel positions. RPA bots break when emulators update, when screens change, and when unexpected dialogs appear.

AI Vision: A Different Architecture

IBM i document processing software built on AI takes a fundamentally different approach to both halves of the problem: understanding the document and interacting with the terminal.

Document Understanding, Not Template Matching

Modern AI vision models don't need templates. They read documents the way a human does—understanding structure, context, and meaning. When an AI model looks at an invoice, it doesn't search for text at predetermined coordinates. It identifies the invoice number because it recognizes the label, the format, and the context. It finds the line items because it understands table structures. It reads the total because it understands the relationship between line amounts and summary figures.

This means a single AI model can process invoices from hundreds of different vendors without any template configuration. It handles layout variations, different languages, varying scan quality, and even handwritten annotations, all without per-vendor setup.

Screen Understanding, Not Coordinate Mapping

On the terminal side, a TN5250 data entry bot powered by AI reads the green screen data stream natively and understands what each screen represents, what fields are available, and what data is expected. Instead of being programmed with “put the vendor number at row 8, column 25,” the AI recognizes the vendor number field by its label and context, enters the data, and handles whatever the screen presents next—including error messages, confirmation prompts, and multi-page entry forms.
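
The difference between coordinate mapping and label-based lookup can be sketched in a few lines of Python. This is a simplification under stated assumptions: it searches rendered screen text, whereas a real 5250 data stream also carries field definitions, and the function name, screen contents, and the "label, dots, colon, input" layout are hypothetical.

```python
import re


def find_field_by_label(screen_rows, label):
    """Return the 1-based (row, col) where input for `label` starts, or None."""
    # Match the label, then skip the filler dots, colon, and spaces that follow it.
    pattern = re.compile(re.escape(label) + r"[ .:]*", re.IGNORECASE)
    for row_index, row in enumerate(screen_rows):
        match = pattern.search(row)
        if match:
            return (row_index + 1, match.end() + 1)
    return None


# Hypothetical entry screen, before and after a new field is added above the vendor number.
screen_v1 = [
    "        Invoice Entry",
    "",
    " Vendor number . . . : ______",
]
screen_v2 = [
    "        Invoice Entry",
    " Company . . . . . . : 001",
    "",
    " Vendor number . . . : ______",
]

print(find_field_by_label(screen_v1, "Vendor number"))
print(find_field_by_label(screen_v2, "Vendor number"))
```

A hard-coded "row 3, column 24" would miss the field on the second screen, while the label lookup follows it to its new row.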

End-to-End Automation: What's Different

Traditional approaches automate either the document reading or the terminal interaction, but rarely both in a unified system. AI-native automation connects the two: the same system that understands the document also understands the terminal, so it can map extracted data to the correct screen fields intelligently—handling field formats, required transformations, and validation rules as a single workflow.

Step-by-Step: PDF to IBM i in Practice

Here's what the automated workflow looks like from start to finish when you scan documents and enter data into AS/400 automatically:

1. Document Arrives

A PDF invoice arrives in an email inbox, lands in a watched folder, or is uploaded through a web interface. The system accepts any standard format: native PDFs, scanned images, photos from a phone camera, or multi-page documents. No preprocessing or format conversion is needed.

2. AI Extracts Fields

The AI vision model analyzes the document and extracts all relevant fields: header information (vendor, date, document number), line items (part numbers, descriptions, quantities, unit prices), and summary data (subtotals, tax, freight, total). A confidence score is assigned to each extracted field, and low-confidence fields are flagged for review.
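
As an illustration of confidence-based flagging, the Python sketch below splits extracted fields into accepted and flagged groups. The data class, the 0.90 threshold, and the sample values are assumptions made for the example, not details of any particular product.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # assumed to range from 0.0 to 1.0


REVIEW_THRESHOLD = 0.90  # assumed cut-off; tune per document type


def split_for_review(fields, threshold=REVIEW_THRESHOLD):
    """Separate fields that can be entered directly from those needing a second look."""
    accepted = [f for f in fields if f.confidence >= threshold]
    flagged = [f for f in fields if f.confidence < threshold]
    return accepted, flagged


# Illustrative values only.
sample = [
    ExtractedField("vendor", "ACME Corp", 0.99),
    ExtractedField("invoice_number", "INV-1001", 0.97),
    ExtractedField("total", "1,234.50", 0.72),
]

accepted, flagged = split_for_review(sample)
print([f.name for f in accepted])
print([f.name for f in flagged])
```

Here the vendor and invoice number pass, and the total is held back because its score falls below the threshold.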

3. Agent Connects via TN5250

The AI agent opens a standard TN5250 session to your IBM i system, authenticating with a designated user profile. This is the same connection method your terminal emulators use—no special protocols, no installed software, no firewall changes beyond what you already permit for remote terminal access.

4. Navigates Menus

The agent navigates from the main menu to the correct data entry screen. It has learned the menu paths for your specific applications and handles intermediate screens, subsystem selections, and library changes as needed. If the navigation path changes—due to a menu restructuring or a new security prompt—the AI adapts without requiring reprogramming.

5. Enters Data

Field by field, screen by screen, the agent enters the extracted data. It respects field lengths, data types, and format requirements. For numeric fields, it handles decimal positions correctly. For date fields, it converts to the format your application expects. For coded fields, it maps document values to the correct system codes.
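
The three transformations mentioned above might look like this in Python. It is a sketch only: the function names, the MMDDYY date pattern, and the payment terms table are hypothetical, and a real application defines its own field lengths and codes.

```python
from datetime import date
from decimal import Decimal, ROUND_HALF_UP


def format_amount(raw, decimals, max_length):
    """Normalize a document amount to a fixed number of decimal places."""
    quantum = Decimal(10) ** -decimals  # decimals=2 -> Decimal("0.01")
    amount = Decimal(raw.replace(",", "")).quantize(quantum, rounding=ROUND_HALF_UP)
    text = str(amount)
    if len(text) > max_length:
        raise ValueError(f"{text!r} does not fit a field of length {max_length}")
    return text


def format_date(value, pattern="%m%d%y"):
    """Render a date in the pattern the target screen expects (MMDDYY assumed here)."""
    return value.strftime(pattern)


# Hypothetical mapping from wording found on documents to application codes.
PAYMENT_TERMS = {"net 30": "N30", "due on receipt": "DOR"}


def map_code(document_value, table):
    """Translate document wording to a system code; a KeyError means a person should decide."""
    return table[document_value.strip().lower()]


print(format_amount("1,234.5", decimals=2, max_length=10))
print(format_date(date(2024, 3, 7)))
print(map_code(" Net 30 ", PAYMENT_TERMS))
```

Raising an error when a value does not fit or a code is unknown keeps the agent from silently truncating or guessing.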

6. Validates Results

After entry, the agent reads confirmation screens and validates that the data was accepted. If the system returns an error—an invalid vendor code, a duplicate document number, a quantity that exceeds a limit—the agent interprets the error message, determines the appropriate response, and either corrects the entry or escalates to human review.
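
The correct-or-escalate decision can be sketched as follows. A simple rule table stands in here for the AI's interpretation of the message, and the message fragments and action names are hypothetical.

```python
from enum import Enum


class Action(Enum):
    ACCEPTED = "accepted"
    CORRECT_AND_RETRY = "correct_and_retry"
    ESCALATE = "escalate"


# Hypothetical fragments of application error messages and the chosen response.
RULES = [
    ("date format", Action.CORRECT_AND_RETRY),
    ("invalid vendor", Action.ESCALATE),
    ("duplicate", Action.ESCALATE),
]


def decide(message_line):
    """Choose a response from the text on the screen's message line."""
    text = message_line.strip().lower()
    if not text:
        return Action.ACCEPTED  # empty message line: assume the entry was accepted
    for fragment, action in RULES:
        if fragment in text:
            return action
    return Action.ESCALATE  # unknown message: hand it to a person


print(decide(""))
print(decide("Date format not valid"))
print(decide("Duplicate document number"))
```

Defaulting to escalation for unrecognized messages is a deliberately conservative choice: a wrong retry on a financial transaction costs more than a short wait for review.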

7. Logs Complete Audit Trail

Every action is logged: which document was processed, what data was extracted, which screens were navigated, what values were entered, and how the system responded. This audit trail is searchable and provides complete traceability from source document to system entry, something that manual data entry rarely achieves.
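
One way to keep such a trail searchable is to write one JSON record per action. The Python sketch below assumes a hypothetical record layout; the key names and sample values are illustrative.

```python
import json
from datetime import datetime, timezone


def audit_record(document_id, screen, field, value, response, when=None):
    """Build one JSON line describing a single field entry."""
    when = when or datetime.now(timezone.utc)
    return json.dumps(
        {
            "timestamp": when.isoformat(),
            "document_id": document_id,
            "screen": screen,
            "field": field,
            "value_entered": value,
            "system_response": response,
        },
        sort_keys=True,
    )


# Illustrative values only.
line = audit_record(
    "INV-1001",
    "Invoice Entry",
    "vendor_number",
    "004512",
    "",
    when=datetime(2024, 3, 7, tzinfo=timezone.utc),
)
print(line)
```

Because every record carries the document identifier, a later search can reconstruct everything that was entered for a given source document.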

Supported Document Types

Because AI document understanding is based on comprehension rather than templates, the range of supported documents is broad and growing. Organizations are using automated IBM i data entry for:

Vendor invoices and credit memos
Purchase orders and change orders
Work orders and service requests
Bills of lading and shipping documents
Insurance claims and intake forms
Inspection and compliance reports
Receiving documents and packing slips
Customer order forms and contracts

The common thread is straightforward: if a human operator can read the document and type the data into a green screen, the AI can do the same—faster, without fatigue, and with a complete audit trail for every entry.

Getting Started

The barrier to automating data entry on IBM i is lower than most organizations expect. Because the solution connects via standard TN5250, there's zero installation on your IBM i system. No PTFs to apply, no libraries to install, no exit programs to configure, no RPG changes to make. If your operators can connect to IBM i with a terminal emulator today, the AI agent can connect tomorrow.

Most organizations start with a single document type, typically the highest-volume, most repetitive workflow. Vendor invoice entry is the most common starting point because the documents are structured, the volumes are high, and the ROI is easy to measure. Getting from first connection to first automated document typically takes one day, with production deployment following a brief validation period.

The key decision isn't whether this technology works—it's whether you can afford to keep paying operators to do manually what AI can do autonomously, accurately, and around the clock.

Ready to Automate Your Document-to-Terminal Workflow?

See LegacyBridge process a real document and enter data into a green screen—live, in under two minutes.
