CV parsing is software that reads a candidate's CV and pulls the details into structured fields. It finds the name, contact details, work history, education, and skills, then sorts them into standard categories your system can search and store. In plain terms, it turns a messy document into clean, organised data.
Resume parsing is the automated process of extracting relevant information from a resume and converting it into a structured format. Algorithms or AI identify the key details and place them into standardised categories, often as XML or JSON, which then load into an ATS or HRIS. That structure is what makes a pile of CVs searchable instead of a folder of files you have to open one by one.
This guide is for busy recruiters, not engineers. You will see how parsing works step by step, what makes it accurate or messy, and what to do so the CVs you handle parse cleanly, both inside your tools and inside your client's ATS.
Key takeaways
- CV parsing is software that reads a CV and turns it into structured data fields like name, experience, education, and skills.
- Parsers read text from left to right and top to bottom, so columns, tables, and text boxes can scramble the output.
- Real selectable text parses well. Images, scans, and logos do not, unless OCR reads them first.
- Standard section headings like Experience, Education, and Skills tell the parser how to sort the content.
- Always check the extracted fields before you send a CV to a client, because a bad parse can cost a candidate the shortlist.
Why parsing matters to recruiters
Parsing decides whether a candidate gets seen. When you submit a CV to a client, their system parses it too. If the layout confuses the parser, the candidate's experience can land in the wrong field or vanish. A strong candidate can drop out of the shortlist for a formatting reason, not a skills reason. In Jobscan's 2025 analysis, 97.8% of Fortune 500 companies were found to use an applicant tracking system, so at large employers the document you send is very likely read by a machine before a person sees it.
Parsing also saves you time on the way in. Instead of retyping a candidate's details into your CRM or template, a parser reads the file and fills the fields for you. The catch is accuracy. Parsers read a document as a continuous, linear stream of text, moving left to right and top to bottom, so a two-column design can come out jumbled. Knowing how parsing behaves lets you avoid the layouts that break it and check the output before it costs you a placement.
How CV parsing works, step by step
Step 1: Ingest the file
The parser takes in the document, whether it is a PDF, Word file, or another format. It first works out what kind of file it is so it knows how to open and read it.
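If you work with a developer, this step can be pictured with a short Python sketch. It guesses the file type from the first few bytes of the file rather than trusting the file extension. The function name `detect_file_type` and the labels it returns are assumptions for illustration, not how any particular parser does it.

```python
def detect_file_type(data: bytes) -> str:
    """Guess the file type from its leading bytes (hypothetical helper)."""
    # PDF files begin with the marker "%PDF-".
    if data.startswith(b"%PDF-"):
        return "pdf"
    # .docx files are ZIP archives, and ZIP archives begin with "PK\x03\x04".
    # Other ZIP-based formats share this signature, hence the cautious label.
    if data.startswith(b"PK\x03\x04"):
        return "docx-or-other-zip"
    # Legacy .doc files use the OLE2 compound file signature.
    if data.startswith(b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"):
        return "doc"
    # RTF files begin with "{\rtf".
    if data.startswith(b"{\\rtf"):
        return "rtf"
    return "unknown"


print(detect_file_type(b"%PDF-1.7\n..."))  # pdf
```

Checking the content instead of the extension matters because a file named "cv.pdf" is not always a real PDF.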
Step 2: Read the text, or OCR a scan
If the file has real selectable text, the parser reads it directly. If the CV is a scan or image, OCR runs first to convert the picture into text the parser can work with.
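One way to picture the "read or OCR" decision is a simple check on how much text came out of the file. The sketch below is a rough illustration only: the function name `needs_ocr` and the threshold of 50 characters per page are assumptions, and real tools use more careful signals.

```python
def needs_ocr(extracted_text: str, page_count: int, min_chars_per_page: int = 50) -> bool:
    """Return True when the text layer looks empty or too thin to trust.

    The 50-characters-per-page threshold is an arbitrary, illustrative value.
    """
    # Remove all whitespace so blank lines and spaces do not count as content.
    visible = "".join(extracted_text.split())
    return len(visible) < min_chars_per_page * max(page_count, 1)


print(needs_ocr("", page_count=2))  # True
```

A scanned CV typically yields an empty or near-empty text layer, which is the case this check is meant to catch.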
Step 3: Detect the sections
The parser looks for structure. It uses headings and patterns to find where the work history starts, where education sits, and where the skills are listed. Standard headings make this far easier.
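To show why standard headings matter, here is a minimal sketch of heading-based section detection in Python. The alias table `SECTION_ALIASES` and the function `split_sections` are hypothetical and far simpler than a commercial parser, but the principle is the same: a line only starts a new section if the parser recognises it as a heading.

```python
import re

# Hypothetical alias table: heading text this sketch accepts for each section.
SECTION_ALIASES = {
    "experience": {"experience", "work experience", "work history", "employment history"},
    "education": {"education", "qualifications"},
    "skills": {"skills", "technical skills", "key skills"},
}


def split_sections(text: str) -> dict[str, list[str]]:
    """Group the lines of a CV under the most recent recognised heading."""
    sections: dict[str, list[str]] = {"header": []}
    current = "header"  # lines before the first heading (name, contact details)
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Normalise: lower-case and drop punctuation such as a trailing colon.
        key = re.sub(r"[^a-z ]", "", line.lower()).strip()
        match = next((name for name, aliases in SECTION_ALIASES.items() if key in aliases), None)
        if match:
            current = match
            sections.setdefault(current, [])
        else:
            sections[current].append(line)
    return sections


cv_text = "Jane Doe\njane@example.com\n\nExperience\nRecruiter, Acme Ltd\n\nEducation\nBA History\n\nMy Journey\nVolunteer work"
for name, lines in split_sections(cv_text).items():
    print(name, lines)
```

In this example the invented heading "My Journey" is not recognised, so it and the line under it are filed under Education. That is the kind of mis-sorting a creative heading can cause.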
Step 4: Extract the fields
Within each section it pulls out specific details: name, contact details, job titles, employers, dates, qualifications, and skills. These become individual data fields rather than one block of text.
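Field extraction often relies on patterns. The sketch below uses two simplified regular expressions, one for an email address and one for a date range such as "2019 - Present". Both patterns and both function names are assumptions for illustration; real parsers handle many more formats and often use AI models as well.

```python
import re
from typing import Optional

# Simplified patterns for illustration; they will miss unusual formats.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
DATE_RANGE_RE = re.compile(
    r"((?:19|20)\d{2})\s*(?:[-\u2013]|to)\s*((?:19|20)\d{2}|present)",
    re.IGNORECASE,
)


def extract_email(text: str) -> Optional[str]:
    """Return the first email-like string in the text, if any."""
    match = EMAIL_RE.search(text)
    return match.group(0) if match else None


def extract_date_range(line: str) -> Optional[tuple[str, str]]:
    """Return (start_year, end_year_or_present) from a line, if found."""
    match = DATE_RANGE_RE.search(line)
    return (match.group(1), match.group(2)) if match else None


print(extract_email("Contact: jane.doe@example.com, London"))
print(extract_date_range("Recruiter, Acme Ltd, 2019 - Present"))
```

Patterns like these only work when the text arrives in a sensible order, which is why layout problems earlier in the process cause wrong or missing fields here.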
Step 5: Map into the ATS or CRM
The structured fields are imported into your system, often as JSON or XML behind the scenes. Now the CV is searchable and ready to reformat, filter, or submit.
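To make "structured fields as JSON" concrete, here is a small Python sketch of what the result of a parse might look like. The `ParsedCV` class and its field names are hypothetical and do not follow any specific ATS or CRM schema.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ParsedCV:
    """Hypothetical container for parsed fields; not a real ATS schema."""
    name: str
    email: str
    skills: list[str] = field(default_factory=list)
    experience: list[dict] = field(default_factory=list)


def to_ats_json(cv: ParsedCV) -> str:
    """Serialise the parsed fields as JSON, ready to import elsewhere."""
    return json.dumps(asdict(cv), indent=2)


cv = ParsedCV(
    name="Jane Doe",
    email="jane.doe@example.com",
    skills=["Sourcing", "Interviewing"],
    experience=[{"title": "Recruiter", "employer": "Acme Ltd", "start": "2019", "end": "Present"}],
)
print(to_ats_json(cv))
```

Once the data is in named fields like these, a system can search by skill or employer instead of opening each file.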
What affects parsing accuracy
Single column beats columns and tables
Parsers read straight across the page, left to right and top to bottom. With tables or columns the parser can slice across the whole page instead of reading cell by cell, which scrambles the text or drops it. Keep the layout in one column.
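The scrambling is easy to reproduce. This short Python sketch mimics a parser that reads each visual row straight across a two-column page. The function `read_linearly` and the sample CV content are made up for illustration.

```python
from itertools import zip_longest


def read_linearly(left_column: list[str], right_column: list[str]) -> list[str]:
    """Mimic a parser that reads each visual row straight across the page."""
    return [
        f"{left} {right}".strip()
        for left, right in zip_longest(left_column, right_column, fillvalue="")
    ]


left = ["Skills", "Python", "SQL"]
right = ["Experience", "Recruiter, Acme Ltd", "2019 - 2023"]
for row in read_linearly(left, right):
    print(row)
# Skills Experience
# Python Recruiter, Acme Ltd
# SQL 2019 - 2023
```

The two headings merge into one line, and each skill is glued to a piece of the work history. A single-column layout avoids this because there is only one thing to read on each row.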
Real text beats images and scans
A parser reads selectable text. It cannot pull information from graphics, logos, or images. A scanned or photographed CV is just a picture, so it needs OCR to turn the image into text before any field can be read.
Standard section headings help sorting
Use plain headings like Experience, Education, and Technical Skills. These tell the parser how to sort the content into the right categories. Invented headings leave the parser guessing.
Headers and footers can get dropped
Many parsers ignore or lose content placed in the document header or footer. If the phone number or email sits up there, it may never get extracted. Keep contact details in the main body.
File type matters
Clean, text-based formats parse best. A .docx or a text-based PDF holds real selectable text, and many ATS prefer .docx. A flattened or scanned PDF gives the parser nothing to read without OCR.
Plain fonts read more reliably
Standard, common fonts map to text cleanly. Decorative or unusual fonts and heavy styling can confuse extraction. Simple formatting gives the parser the best chance to read every word.
Do this for clean parsing
- Use a single-column layout so the parser reads top to bottom without jumping around.
- Send real, selectable text in a .docx or a text-based PDF, never a scan or screenshot.
- Use standard section headings like Experience, Education, and Skills so the parser sorts the content correctly.
- Keep contact details in the main body of the document, not in the header or footer.
- Stick to plain, common fonts and simple formatting instead of heavy styling.
- Check the extracted fields after parsing and fix any wrong or missing data before you export.
- Before you submit, confirm that a client's ATS can read the CV, so the candidate is not filtered out for a layout reason.
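The "check the extracted fields" step in the list above can be partly automated. Here is a minimal Python sketch that flags empty or missing fields in a parsed result. The list `REQUIRED_FIELDS` and the function `find_missing_fields` are assumptions for illustration, and a check like this does not replace reading the output yourself, because it cannot tell whether a filled-in field is correct.

```python
# Hypothetical list of fields a recruiter would expect every parse to fill.
REQUIRED_FIELDS = ("name", "email", "phone", "experience", "education", "skills")


def find_missing_fields(parsed: dict) -> list[str]:
    """Return the required fields that are absent or empty in a parsed CV."""
    return [name for name in REQUIRED_FIELDS if not parsed.get(name)]


parsed_cv = {
    "name": "Jane Doe",
    "email": "",  # e.g. dropped because it sat in the document header
    "experience": [{"title": "Recruiter", "employer": "Acme Ltd"}],
}
print(find_missing_fields(parsed_cv))  # ['email', 'phone', 'education', 'skills']
```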
Common parsing mistakes to avoid
Trusting a scanned or image CV
A scanned PDF or a photo of a CV is a picture, not text. Without OCR the parser reads nothing and the fields come back empty. Always work from a text-based file, or run OCR first.
Keeping columns and tables
Two-column designs and tables look tidy to a human but break parsers. The parser can read straight across the page and mix the columns together, so the output is out of order or missing.
Hiding details in headers and footers
Contact details placed in the document header or footer often get dropped completely. The candidate then looks unreachable in the system even though the details were on the page.
Not checking the extracted fields
Parsing is fast but not perfect. If you do not review the output, a wrong job title or a missing employer slips through. A quick check before export catches these errors.
Sending an unparseable CV to a client's ATS
If you forward a CV that your own parser struggled with, the client's ATS will struggle too. The candidate can drop off the shortlist for a formatting reason, not a skills reason.
Using creative headings the parser cannot recognise
Inventive section titles confuse the sorting step. The parser may not realise where the work history is. Plain headings keep the structure clear.
Frequently asked questions
What is CV parsing?
CV parsing, also called resume parsing, is the automated process of reading a CV and extracting the relevant information into a structured format. Software identifies key details such as name, contact details, work experience, education, and skills, then sorts them into standard categories, often stored as JSON or XML. Those fields load into an ATS or CRM, which makes the CV searchable instead of a file you have to read by hand.
How does resume parsing work?
The parser ingests the file, then reads the text directly or runs OCR first if the CV is a scan. It then detects sections using headings and patterns, finds where experience, education, and skills sit, and extracts the specific fields like job titles, dates, and qualifications. Finally it maps those fields into your ATS or CRM. The cleaner the layout and headings, the more accurate each of these steps becomes.
Why does a CV parse badly?
Usually because of layout or file type. Parsers read left to right and top to bottom, so tables and columns can scramble the text or drop it. Images, logos, and scanned pages have no text to read without OCR. Content in headers and footers often gets lost. Creative headings and unusual fonts also confuse the parser. A single column of plain, real text fixes most of these problems.
Do all ATS parse CVs the same way?
No. Each system has its own parser, so results vary. Some handle a layout fine while others struggle with the same file. Because you cannot control which system a client uses, the safe approach is to send a simple, single-column, text-based CV with standard headings. That format gives every parser the best chance to read it correctly, whatever ATS the client runs.
How do I make a CV parse correctly?
Use one column, real selectable text, and standard section headings like Experience, Education, and Skills. Save it as a .docx or a text-based PDF, not a scan or image. Keep contact details in the body, not the header or footer, and use plain fonts. After parsing, check the extracted fields and correct anything wrong before you export or submit. These steps keep the CV readable for any parser.
The bottom line
CV parsing is simple to picture once you see it as software reading a CV and turning it into organised fields. It ingests the file, reads the text or OCRs a scan, finds the sections, pulls out the details, and loads them into your system. Accuracy comes down to the document. A single column of real text with standard headings parses cleanly. Columns, tables, scans, and content stuck in headers and footers cause trouble. Since most large employers run a CV through an ATS before a human reads it, the format you send matters as much as the content. Keep CVs parser-friendly and always check the extracted data before it leaves your hands.
Parsing is the first thing RefineCV does for you. It reads a candidate's CV with AI extraction, including OCR for scans, across 10 input formats, then lets you check and fix the extracted data before you reformat it into your branded template. It exports a clean text-based PDF or DOCX that stays readable for a client's ATS. See how it works, or view the transparent pricing. Try it free on 10 CVs, no card.
Related reading: how to make a candidate CV ATS-friendly and the recruitment CV template.
Sources
- AIHR (Academy to Innovate HR) HR Glossary (2024): Resume parsing is the automated process of extracting relevant information from a resume into a structured format, where algorithms or AI identify key details and sort them into standardised categories (often XML or JSON) that import into an ATS or HRIS to make resumes searchable.
- Jobscan, Why ATS Tables and Columns Break Your Resume Parsing (2026): Most ATS resume parsers read a document as a continuous, linear stream of text, left to right and top to bottom, so a table makes the parser slice horizontally across the page rather than read cell by cell, causing text-layer scrambling where content is out of order or missing.
- University of Minnesota Duluth Career Center, Applicant Tracking System (ATS) Tips (2024): ATS struggle with tables, text boxes, columns, graphics, logos, and images, and content in headers and footers may be dropped completely; the ATS prefers the .docx file type and standard section headings such as Education, Experience, and Technical Skills so it knows how to sort the information.
- Jobscan, Fortune 500 Use Applicant Tracking Systems (2025): In Jobscan's 2025 analysis, 97.8% of Fortune 500 companies were found to use an applicant tracking system, with 489 of 500 having a detectable ATS.