DoxCycle Extraction / Post August 2025 Update

Accounting · October 1, 2025, 11:37pm

Hello,

I am reaching out about data extraction with DoxCycle. I noticed that an update was pushed last August, so I tested it on a 2024 file and a 2025 file. I still saw issues with extraction accuracy, and on some slips no data was extracted at all. Do you have any suggestions for improving the data extraction (other than the quality of the document, over which we have limited control)?

Also, for the OCR updates:

Are the updates layout/template fixes, or has any AI/layout‑aware OCR been added (e.g., models that “understand” form structure)?
If not, are there plans to incorporate more advanced document AI / layout‑based recognition (for example, what Microsoft’s Form Recognizer, Google Document AI, or models like LayoutLM do)?
Are these updates planned for release before the next tax season?

Thanks so much in advance!

sarka · October 2, 2025, 12:30am

Data extraction accuracy depends on a number of factors, including the quality of the document image, the accuracy of the OCR in picking up all of the data, and a whole lot of programming logic. We are currently working on replacing some of the hard-coded data extraction logic with a data extraction model that will be more tolerant of slip changes over time and is optimized for DoxCycle’s OCR.

For the OCR:

We replaced the OCR supplier in August of last year (2024) when our contract expired, because we could not justify the price increase the previous supplier was asking for. In testing, the accuracy of the new OCR was as good as or better than the old one, but we acknowledge that any change carries a risk of introducing new bugs. There is still some fine-tuning we need to do related to the OCR.
We have had some preliminary discussions about moving to an AI-based LLM, but no decisions have been made. We have a lot of ground to cover to understand the upfront and ongoing costs, the effort, and the level of customer interest.
Yes, assuming there are no big surprises that divert resources, we are planning to release DoxCycle updates to improve both the document classification and data extraction accuracy in time for the coming tax season.

Accounting · October 2, 2025, 12:36am

Thanks! Looking forward to it! It can make such a big difference during tax season!

mgbtax · October 2, 2025, 12:58am

Thanks! I too look forward to it. I use DoxCycle with almost all my files, mainly because not all slips are on CRA when returns are filed and also to have a link between data and the return, which saves a step when looking for something.

Accounting · October 2, 2025, 8:29pm

Hello,

I discussed it with other firm owners. I am pretty sure a lot of us would be willing to pay a premium for perfect or almost perfect OCR. It would save so much time in labor (which is hard to find during tax season).

Right now, American companies like Soraban and StanfordTax charge a per-return fee for organized slip collection and data extraction.

We can’t rely on CRA for timely slips, and the AFR really just serves as verification.

Great OCR would be the best step toward automating tax return preparation and would be a great competitive advantage for TaxCycle.

I thought I would share. Thanks for everything.

Accounting · November 28, 2025, 7:28pm

Hello,

I heard your response in the webinar yesterday about data extraction. Would you be able to share the current accuracy rate for data extraction, and how it compares to last year? What is expected for this upcoming tax season?

Thanks in advance
