Differences

This shows you the differences between two versions of the page.

public:gsoc:ocr [2018/02/14 23:26]
cfsmp3
public:gsoc:ocr [2018/02/14 23:29]
abhinav95
Line 23: Line 23:
 We will provide all the samples and access to a high speed server that has them so the student can work on it (optional) if a fast internet connection is not available to them.
  
-__**Qualification task**__\\
+__**Qualification tasks**__\\
 [[https://github.com/CCExtractor/ccextractor/issues/929|Terrible OCR results with Channel 5 (UK)]]\\
 This task is ideal to get started, because you only need to deal with one function in one file: [[https://github.com/CCExtractor/ccextractor/blob/930ca716ca0bdae629ddd170abbcc2ad75472422/src/lib_ccx/ocr.c|quantize_map]]() in src/lib_ccx/ocr.c
 +
 +In addition to the samples that we already have, we would also like a dataset of a few hardsubbed videos (videos with burned-in subtitles), each with an accurate timed transcript, so that we can evaluate the performance of our code on a wide variety of real-world samples. For the qualification task, this does not have to be huge; a good representative set will do fine.
  
 __**Related GitHub Issues**__\\
  • public/gsoc/ocr.txt
  • Last modified: 2018/02/14 23:32
  • by abhinav95