This series introduces a variety of practical ways to use Python.
This time, we use EasyOCR to read text from images.
Let's try OCR hands-on.
Everything can be implemented easily on Google Colab, so please read through to the end.
Goals for this article
・What EasyOCR is
・Basic usage of EasyOCR
・Improving EasyOCR accuracy
・Visualizing OCR results
- 1. What Is EasyOCR?
- 1.1. What Is OCR?
- 1.2. What Is EasyOCR?
- 2. Basic Usage of EasyOCR
- 2.1. Installation
- 2.2. Preparing the Image
- 2.3. Implementing OCR
- 2.4. Displaying the Results as a Table
- 3. Improving EasyOCR Accuracy
- 3.1. Arguments for easyocr.Reader()
- 3.2. Arguments for reader.readtext()
- 3.3. Improving Accuracy
- 3.4. Techniques for Improving Accuracy
- 4. Visualizing OCR Results
- 5. Summary
What Is EasyOCR?
What Is OCR?
OCR (Optical Character Recognition) is a technology that recognizes text in images and converts it into text data that can be edited on a computer. It makes it possible to automatically read the character information contained in paper documents and digital images.
OCR works in roughly the following steps. First, an image is captured with a scanner or camera. Next, the image is preprocessed, which includes skew correction, noise removal, and contrast adjustment. Preprocessing improves the accuracy of text recognition.
Then the text regions in the image are located and the individual characters are cut out. In this step, character outlines are detected and character shapes are analyzed. Each extracted character is converted into a feature vector by a feature extraction algorithm.
Finally, a machine learning model identifies the actual character from the feature vector. This model has been trained in advance on a large amount of character image data. The identified characters are output as text data.
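The steps above (preprocessing, feature extraction, classification) can be illustrated with a deliberately tiny, pure-Python sketch. Everything in it is an illustrative assumption: the 3x3 glyph templates, the threshold of 128, and the nearest-template classifier stand in for a trained model, and segmentation is skipped by assuming one character has already been cropped. Real OCR engines, including EasyOCR, rely on trained neural networks rather than anything this simple.

```python
# Toy illustration of the classic OCR steps; not how EasyOCR works internally.

# Hypothetical 3x3 binary templates standing in for a trained model.
TEMPLATES = {
    "I": [0, 1, 0,
          0, 1, 0,
          0, 1, 0],
    "L": [1, 0, 0,
          1, 0, 0,
          1, 1, 1],
    "T": [1, 1, 1,
          0, 1, 0,
          0, 1, 0],
}


def binarize(gray, threshold=128):
    """Preprocessing: dark pixels (ink) become 1, light pixels become 0."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def to_feature_vector(binary):
    """Feature extraction: flatten the 2D glyph into a 1D vector."""
    return [px for row in binary for px in row]


def classify(features):
    """Recognition: pick the template with the smallest Hamming distance."""
    def distance(label):
        return sum(a != b for a, b in zip(features, TEMPLATES[label]))
    return min(TEMPLATES, key=distance)


# A grayscale "scan" of one cropped character (0 = black ink, 255 = white paper).
scan = [
    [20, 240, 250],
    [35, 230, 245],
    [10, 40, 100],
]
recognized = classify(to_feature_vector(binarize(scan)))
print(recognized)
```

Swapping the template lookup for a trained classifier is what turns this outline into a practical system; the overall flow of the steps stays the same.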
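OCR is used in many areas, such as digitizing paper documents, managing business cards, and automating data entry. In Python, OCR can be implemented with libraries such as Tesseract (for recognition) and OpenCV (typically for image preprocessing). With these libraries, an OCR system can be built with relatively little effort.
However, OCR accuracy depends heavily on image quality, the type of characters, and the complexity of the layout. Recognition accuracy tends to drop for handwritten text and for images with complex backgrounds. For this reason, preprocessing to raise image quality and post-processing of the recognition results are both important when applying OCR.
OCR is a very useful technology for digitizing and automating document handling. Building an OCR system in Python lets you extract data from paper documents efficiently and can improve productivity.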
What Is EasyOCR?
EasyOCR is an easy-to-use optical character recognition (OCR) library written in Python. With EasyOCR, you can easily read the text in images and scanned documents.
A major feature of EasyOCR is its support for more than 80 languages. Besides English and Japanese, it can recognize Chinese, Arabic, Cyrillic scripts, and many other writing systems from around the world. Simply specifying a language downloads the model for that language automatically, so it is ready to use right away.
EasyOCR also supports accelerated processing on a GPU, which can greatly improve throughput when processing large numbers of images. Without a GPU, processing can still run on the CPU.
Using EasyOCR is very simple. First, create a Reader object and specify the languages you want to recognize. Then pass an image path or image data to the readtext method to get the recognition result. The result is returned as a list containing the coordinates of each piece of text, the recognized text, and the recognition confidence.
EasyOCR is built on deep learning models, which gives it high recognition accuracy. Because the code is released as open source, developers are free to customize the models or add support for new languages.
Details: https://github.com/JaidedAI/EasyOCR
Basic Usage of EasyOCR
From here on, we work in a Google Colab environment.
Installation
First, install EasyOCR.
!pip install easyocr
The installation is now complete.
Preparing the Image
Let's start by implementing basic OCR.
We use the following image this time.

Implementing OCR
The example below targets English and Japanese.
It runs OCR without using a GPU.
Finally, it displays the result.
import easyocr
reader = easyocr.Reader(['en','ja'], gpu=False)
result = reader.readtext('29767855_m.jpg')
result
The code above does the following.
- easyocr.Reader configures OCR. ['en', 'ja'] means that English and Japanese are the recognition targets. gpu=False means that processing runs on the CPU without a GPU.
- reader.readtext('29767855_m.jpg') runs OCR on the image file 29767855_m.jpg.
Output:
[([[923, 589], [1015, 589], [1015, 691], [923, 691]], '成', 0.9969421195671124),
([[624.0259849793654, 639.3195309050808],
[924.1861792718654, 593.1081907488701],
[932.9740150206346, 712.6804690949192],
[632.8138207281346, 757.8918092511299]],
'資料作',
0.9984824140989369),
([[649.4780193260012, 766.7390096630006],
[1209.0622660481395, 665.9624206519773],
[1226.5219806739988, 774.2609903369994],
[666.9377339518607, 875.0375793480227]],
'プレゼンの練習',
0.9713997451831543),
([[672.8626652879326, 886.6077314031057],
[1159.9401030422644, 806.6564306633506],
[1173.1373347120673, 923.3922685968943],
[686.0598969577356, 1002.3435693366494]],
'lon1の準備',
0.7934918448755321),
([[688.5825808840516, 1013.6408389724568],
[1235.1181535068565, 930.1098007637163],
[1248.4174191159484, 1045.3591610275432],
[701.8818464931436, 1128.8901992362837]],
'MTG資料印刷',
0.9891207866280942),
([[734.5060845773462, 1165.5518253732039],
[1268.3019336181942, 1074.9668503015719],
[1283.4939154226538, 1169.4481746267961],
[749.6980663818058, 1260.0331496984281]],
'Aさんにmail',
0.9949341920660215)]
The output shows each extracted string together with its bounding-box coordinates and a confidence score.
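Since each entry in the result is a (bounding box, text, confidence) tuple, it can be unpacked directly in a loop. The following is a minimal sketch that separates trusted detections from doubtful ones. The sample list is a shortened copy of the output above with rounded values, and the helper name split_by_confidence and the 0.8 cutoff are arbitrary choices for illustration.

```python
# Shortened copy of the readtext() output shown above (values rounded).
result = [
    ([[923, 589], [1015, 589], [1015, 691], [923, 691]], "成", 0.9969),
    ([[624.0, 639.3], [924.2, 593.1], [933.0, 712.7], [632.8, 757.9]], "資料作", 0.9985),
    ([[672.9, 886.6], [1159.9, 806.7], [1173.1, 923.4], [686.1, 1002.3]], "lon1の準備", 0.7935),
]


def split_by_confidence(detections, min_confidence=0.8):
    """Separate detections into trusted and doubtful texts."""
    trusted, doubtful = [], []
    for _bbox, text, confidence in detections:
        if confidence >= min_confidence:
            trusted.append(text)
        else:
            doubtful.append(text)
    return trusted, doubtful


trusted, doubtful = split_by_confidence(result)
print("trusted: ", trusted)
print("doubtful:", doubtful)
```

A high confidence does not guarantee a correct reading, so a cutoff like this is best treated as a hint about which lines to review by eye.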
Displaying the Results as a Table
To make the output easier to read, let's display it as a table.
import easyocr
import pandas as pd

reader = easyocr.Reader(['en', 'ja'], gpu=False)
result = reader.readtext('29767855_m.jpg')

# Flatten each detection (bounding box, text, confidence) into one table row
data = []
for detection in result:
    text = detection[1]
    confidence = detection[2]
    coordinates = detection[0]
    # Corners are ordered top-left, top-right, bottom-right, bottom-left
    x1, y1 = coordinates[0]
    x2, y2 = coordinates[1]
    x3, y3 = coordinates[2]
    x4, y4 = coordinates[3]
    data.append({"text": text, "confidence": confidence,
                 "x1": x1, "y1": y1, "x2": x2, "y2": y2,
                 "x3": x3, "y3": y3, "x4": x4, "y4": y4})

df = pd.DataFrame(data)
df
Output:
index | text | confidence | x1 | y1 | x2 | y2 | x3 | y3 | x4 | y4 |
---|---|---|---|---|---|---|---|---|---|---|
0 | 成 | 0.9969421195671124 | 923.0 | 589.0 | 1015.0 | 589.0 | 1015.0 | 691.0 | 923.0 | 691.0 |
1 | 資料作 | 0.9984824140989369 | 624.0259849793654 | 639.3195309050808 | 924.1861792718654 | 593.1081907488701 | 932.9740150206346 | 712.6804690949192 | 632.8138207281346 | 757.8918092511299 |
2 | プレゼンの練習 | 0.9713997451831543 | 649.4780193260012 | 766.7390096630006 | 1209.0622660481395 | 665.9624206519773 | 1226.5219806739988 | 774.2609903369994 | 666.9377339518607 | 875.0375793480227 |
3 | lon1の準備 | 0.7934918448755321 | 672.8626652879326 | 886.6077314031057 | 1159.9401030422644 | 806.6564306633506 | 1173.1373347120673 | 923.3922685968943 | 686.0598969577356 | 1002.3435693366494 |
4 | MTG資料印刷 | 0.9891207866280942 | 688.5825808840516 | 1013.6408389724568 | 1235.1181535068565 | 930.1098007637163 | 1248.4174191159484 | 1045.3591610275432 | 701.8818464931436 | 1128.8901992362837 |
5 | Aさんにmail | 0.9949341920660215 | 734.5060845773462 | 1165.5518253732039 | 1268.3019336181942 | 1074.9668503015719 | 1283.4939154226538 | 1169.4481746267961 | 749.6980663818058 | 1260.0331496984281 |
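Comparing the image with the output, we can see that "資料作成" has been split into "資料作" and "成".
In addition, "1on1" was misread as "lon1", with a lowercase L in place of the digit 1, and its confidence (about 0.79) is noticeably lower than that of the other lines.
The accuracy needs a little more improvement.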
Improving EasyOCR Accuracy
EasyOCR lets you set arguments when running OCR.
The arguments and their meanings are listed below.
Arguments for easyocr.Reader()
Argument | Description |
---|---|
lang_list | List of language codes to recognize, for example ['ch_sim', 'en']. |
gpu | Whether to enable the GPU. Default is True. With gpu=False, EasyOCR runs on the CPU. |
model_storage_directory | Path to the directory where model data is stored. If not specified, models are read from the directory defined by the environment variable EASYOCR_MODULE_PATH (preferred), MODULE_PATH (if defined), or ~/.EasyOCR/. |
download_enabled | Whether to allow downloading when EasyOCR cannot find the model files. Default is True. |
user_network_directory | Path to a user-defined recognition network. If not specified, models are read from MODULE_PATH + '/user_network' (~/.EasyOCR/user_network). |
recog_network | Selects a custom recognition network instead of the standard mode. A tutorial on this is planned. Default is 'standard'. |
detector | Whether to load the detection model into memory. Default is True. |
recognizer | Whether to load the recognition model into memory. Default is True. |
Arguments for reader.readtext()
Parameter | Description |
---|---|
image | Input image, given as a string, a NumPy array, or bytes. |
decoder | Decoder to use. Choose from 'greedy', 'beamsearch', or 'wordbeamsearch' (word-level beam search). Default is 'greedy'. |
beamWidth | Number of beams to keep when using 'beamsearch' or 'wordbeamsearch'. Default is 5. |
batch_size | Batch size. A value greater than 1 makes EasyOCR faster but consumes more memory. Default is 1. |
workers | Number of threads used by the data loader. Default is 0. |
allowlist | String that restricts the characters to recognize. Useful for specific problems such as license plates. |
blocklist | String of characters to exclude from recognition. Ignored when allowlist is specified. |
detail | Level of detail of the output. Set to 0 for simple output. Default is 1. |
paragraph | Whether to combine the results into paragraphs. Default is False. |
min_size | Filters out text boxes smaller than this value, in pixels. Default is 10. |
rotation_info | Lets EasyOCR rotate each text box and return the reading with the highest confidence. The values 90, 180, and 270 are available. For example, [90, 180, 270] tries every possible text orientation. Default is None. |
contrast_ths | Text boxes with contrast lower than this value are passed to the model twice: once as the original image and once with the contrast adjusted to the 'adjust_contrast' value. The reading with the higher confidence is returned. Default is 0.1. |
adjust_contrast | Target contrast level for low-contrast text boxes. Default is 0.5. |
text_threshold | Text confidence threshold. Default is 0.7. |
low_text | Lower-bound score for text. Default is 0.4. |
link_threshold | Link confidence threshold. Default is 0.4. |
canvas_size | Maximum image size. Images larger than this value are resized. Default is 2560. |
mag_ratio | Image magnification ratio. Default is 1. |
slope_ths | Maximum slope (delta y / delta x) considered for merging. A low value means that tilted boxes are not merged. Default is 0.1. |
ycenter_ths | Maximum shift in the y direction. Boxes at different vertical levels should not be merged. Default is 0.5. |
height_ths | Maximum difference in box height. Boxes with very different text sizes should not be merged. Default is 0.5. |
width_ths | Maximum horizontal distance for merging boxes. Default is 0.5. |
add_margin | Extends the bounding boxes in all directions by the given value. This matters for languages with complex scripts, such as Thai. Default is 0.1. |
x_ths | Maximum horizontal distance for merging text boxes when paragraph=True. Default is 1.0. |
y_ths | Maximum vertical distance for merging text boxes when paragraph=True. Default is 0.5. |
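Several of these parameters are often used together, so it can help to keep common combinations as named presets. The sketch below is one possible way to do that. The preset names and the read_with helper are hypothetical and not part of EasyOCR; only the keyword names (detail, paragraph, allowlist, rotation_info) come from the table above.

```python
# Presets of readtext() keyword arguments taken from the table above.
PLAIN_TEXT = {"detail": 0, "paragraph": True}           # merged paragraphs, simple output
DIGITS_ONLY = {"allowlist": "0123456789", "detail": 0}  # restrict recognition to digits
ANY_ROTATION = {"rotation_info": [90, 180, 270]}        # also try rotated text boxes


def read_with(reader, image, *presets, **overrides):
    """Merge presets (later ones win), apply overrides, and call reader.readtext()."""
    kwargs = {}
    for preset in presets:
        kwargs.update(preset)
    kwargs.update(overrides)
    return reader.readtext(image, **kwargs)


# Usage (requires easyocr and an image file):
# import easyocr
# reader = easyocr.Reader(['en', 'ja'], gpu=False)
# lines = read_with(reader, '29767855_m.jpg', PLAIN_TEXT, mag_ratio=1.1)
```

Passing the reader in as an argument keeps the helper independent of how the Reader was created, so the same presets can be reused for a CPU or a GPU reader.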
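Improving Accuracy
Let's change the arguments for the earlier image to improve accuracy.
- Set link_threshold to 0.3: link_threshold is the threshold on the link confidence, that is, how strongly neighboring pieces of text must be connected to be grouped together (default 0.4). Lowering it makes EasyOCR more likely to combine text into a single piece even when the connection is weaker. In other words, the criterion for treating text as continuous becomes looser.
- Set mag_ratio to 1.1: mag_ratio is the image magnification ratio. The default is 1, and setting it to 1.1 processes the image enlarged by 10%. This makes small characters easier to recognize, although enlarging may slightly degrade image quality.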
import easyocr
reader = easyocr.Reader(['en','ja'], gpu=False)
result = reader.readtext('29767855_m.jpg', link_threshold=0.3, mag_ratio=1.1)
Output:
index | text | confidence | x1 | y1 | x2 | y2 | x3 | y3 | x4 | y4 |
---|---|---|---|---|---|---|---|---|---|---|
0 | 資料作成 | 0.997967541217804 | 623.8005305310426 | 643.0013263276064 | 1014.6341038492525 | 577.9684645465434 | 1025.1994694689574 | 693.9986736723936 | 635.3658961507475 | 759.0315354534566 |
1 | プレゼンの練習 | 0.977445970814249 | 650.8525919657369 | 766.3115551794422 | 1207.1576651173118 | 668.3966122371344 | 1224.147408034263 | 773.6884448205578 | 667.8423348826882 | 872.6033877628656 |
2 | 1on1の準備 | 0.5931251849512019 | 672.1005050633884 | 890.1005050633884 | 1158.952099161938 | 807.6855736622541 | 1173.8994949366117 | 919.8994949366116 | 687.0479008380621 | 1001.3144263377459 |
3 | MTG資料印刷 | 0.9954317802241432 | 689.5825808840516 | 1013.6408389724568 | 1236.101159914352 | 929.0644545490311 | 1249.4174191159484 | 1045.3591610275432 | 702.8988400856481 | 1129.935545450969 |
4 | Aさんにmail | 0.9986604964795984 | 732.8582797093769 | 1163.5433118837507 | 1267.3045006624968 | 1073.9740510719075 | 1282.1417202906232 | 1168.4566881162493 | 747.6954993375032 | 1259.0259489280925 |
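Compared with the earlier run, the text is now recognized correctly: "資料作成" is detected as a single box and "1on1" is read correctly. Note, however, that the confidence for "1on1の準備" is only about 0.59, so that line remains a borderline case.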
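Picking values such as 0.3 and 1.1 usually involves some trial and error. The following sketch shows one way to compare several combinations side by side. The function name sweep_settings and the candidate values are assumptions for illustration; the only EasyOCR call used is reader.readtext() with the link_threshold and mag_ratio arguments shown above.

```python
from itertools import product


def sweep_settings(reader, image, link_thresholds=(0.4, 0.3), mag_ratios=(1.0, 1.1)):
    """Run readtext() for every combination and summarize each run."""
    summary = []
    for link, mag in product(link_thresholds, mag_ratios):
        result = reader.readtext(image, link_threshold=link, mag_ratio=mag)
        confidences = [conf for _bbox, _text, conf in result]
        summary.append({
            "link_threshold": link,
            "mag_ratio": mag,
            "boxes": len(result),
            "min_confidence": min(confidences) if confidences else None,
            "texts": [text for _bbox, text, _conf in result],
        })
    return summary


# Usage (requires easyocr and an image file):
# import easyocr
# reader = easyocr.Reader(['en', 'ja'], gpu=False)
# for row in sweep_settings(reader, '29767855_m.jpg'):
#     print(row)
```

The summary keeps the recognized texts alongside the scores because confidence alone can mislead: in the runs above, the wrong reading "lon1" scored about 0.79 while the correct "1on1" scored about 0.59.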
Techniques for Improving Accuracy
OCR accuracy varies greatly with image quality, the type of characters, and similar factors. The following tips can help improve it.
- Improve image quality: Preparing high-resolution, low-noise images improves OCR accuracy.
- Scale the image to an appropriate size: Adjusting the image size with the mag_ratio parameter makes small characters easier to recognize. Be careful, though, because enlarging too much can actually reduce accuracy.
- Increase the contrast between text and background: The larger the color difference between the text and the background, the better the OCR accuracy.
- Restrict the characters to recognize: When you only want to recognize digits, for example, restricting the targets reduces unwanted results. Use the allowlist parameter.
- Group words and lines: Adjusting the link_threshold parameter helps join words and lines properly.
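Beyond tuning parameters, post-processing the recognized text can also help when the expected wording is known in advance. The sketch below uses fuzzy matching from Python's standard difflib module, which is a general technique and not an EasyOCR feature. The vocabulary list, the helper name correct_text, and the 0.8 cutoff are assumptions for illustration.

```python
from difflib import get_close_matches

# Hypothetical vocabulary of phrases expected to appear in the image.
EXPECTED_PHRASES = ["資料作成", "プレゼンの練習", "1on1の準備", "MTG資料印刷"]


def correct_text(text, vocabulary=EXPECTED_PHRASES, cutoff=0.8):
    """Return the closest expected phrase, or the text unchanged if none is close."""
    matches = get_close_matches(text, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else text


print(correct_text("lon1の準備"))    # a close match exists in the vocabulary
print(correct_text("Aさんにmail"))   # no close match, so the text is kept as-is
```

This only works when the set of likely phrases is small and known, such as form labels or product names; for free-form text it risks replacing correct readings with the wrong phrase.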
Visualizing OCR Results
The output can be visualized by drawing a frame around each detected region.
To keep things simple, we rotate the earlier image beforehand.

import cv2
import easyocr

# Load the image
image = cv2.imread('29767855_m_rot.jpg')

# Run OCR
reader = easyocr.Reader(['en','ja'], gpu=False)
result = reader.readtext('29767855_m_rot.jpg', link_threshold=0.3, mag_ratio=1.1)

# Draw the results on the original image
for (bbox, text, prob) in result:
    # Draw only detections with a confidence of at least 50%
    if prob >= 0.5:
        # Get the corner coordinates of the box
        (top_left, top_right, bottom_right, bottom_left) = bbox
        top_left = (int(top_left[0]), int(top_left[1]))
        bottom_right = (int(bottom_right[0]), int(bottom_right[1]))
        # Draw the rectangle
        cv2.rectangle(image, top_left, bottom_right, (0, 255, 0), 2)

# Save the result
cv2.imwrite("result_image.jpg", image)
Output:

The regions recognized by OCR are now visualized.
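The drawing code uses only the top-left and bottom-right corners of each box, which works when the text is close to horizontal. For a tilted box those two corners do not enclose the whole region. One possible workaround, sketched below, is to take the minimum and maximum over all four corners. The helper name bbox_to_rect is hypothetical, and the sample box is a rounded copy of a tilted box from the earlier, unrotated output.

```python
def bbox_to_rect(bbox):
    """Return (top_left, bottom_right) integer corners enclosing all four points."""
    xs = [point[0] for point in bbox]
    ys = [point[1] for point in bbox]
    return (int(min(xs)), int(min(ys))), (int(max(xs)), int(max(ys)))


# Tilted box taken from the earlier output (values rounded).
tilted = [[624.0, 639.3], [924.2, 593.1], [933.0, 712.7], [632.8, 757.9]]
top_left, bottom_right = bbox_to_rect(tilted)
print(top_left, bottom_right)
```

The two returned points can be passed to cv2.rectangle(image, top_left, bottom_right, (0, 255, 0), 2) in place of the corners computed in the loop above.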
Summary
Thank you for reading to the end.
With EasyOCR, even beginners can implement OCR easily.
It is important to keep the accuracy tips in mind and to put some care into image preprocessing.
OCR is useful in many situations, so please try it on a variety of images.