GSoC/GCI Archive
Google Code-in 2013 Apertium

phenny/begiak url module localisation improvements

completed by: reikaze

mentors: Jonathan Washington, Francis Tyers

Phenny/begiak has a module that detects pasted URLs and reports the title of the page. This doesn't work well with webpages that are not encoded in UTF-8. Fix this so that the titles are properly converted to UTF-8 and display as intended. Some titles to test on include Ìàäèåâà - Àâàðñêèé ÿçûê, Ïîèñê â êîðïóñå. Íàöèîíàëüíûé êîðïóñ ðóññêîãî ÿçûêà, Óêðà¿íñüêà ïðàâäà, ×óâàøñêàÿ ðåñïóáëèêàíñêàÿ ãàçåòà «Õûïàð», ÀÎÒ :: Òåõíîëîãèè :: Ðóññêàÿ ìîðôîëîãèÿ
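
The sample titles above look like Windows-1251 text that was decoded as Latin-1. This is an inference from the characters, not something the task states. Under that assumption, one possible approach is to decode the raw page bytes with the declared charset before extracting the title, and fall back to a short list of likely encodings. The sketch below is illustrative only: `decode_page`, `extract_title`, and the fallback list are hypothetical names and choices, not the actual begiak module code.

```python
import re

# Hypothetical helpers for illustration; not taken from the begiak source.
# Looks for a charset declaration such as <meta charset="windows-1251">.
META_CHARSET = re.compile(rb'charset\s*=\s*["\']?\s*([A-Za-z0-9_\-]+)', re.IGNORECASE)


def decode_page(raw, header_charset=None, fallbacks=("utf-8", "windows-1251")):
    """Decode raw page bytes to str, trying declared charsets first."""
    candidates = []
    if header_charset:
        # Charset from the HTTP Content-Type header, if the caller has one.
        candidates.append(header_charset)
    match = META_CHARSET.search(raw[:2048])
    if match:
        candidates.append(match.group(1).decode("ascii"))
    # Assumed fallback order; a real bot might prefer a detection library.
    candidates.extend(fallbacks)
    for name in candidates:
        try:
            return raw.decode(name)
        except (LookupError, UnicodeDecodeError):
            # Unknown codec name or bytes invalid in that codec: try the next.
            continue
    # Last resort: never raise, mark undecodable bytes instead.
    return raw.decode("utf-8", errors="replace")


def extract_title(raw, header_charset=None):
    """Return the page title with whitespace collapsed, or None if absent."""
    text = decode_page(raw, header_charset)
    match = re.search(r"<title[^>]*>(.*?)</title>", text, re.IGNORECASE | re.DOTALL)
    return " ".join(match.group(1).split()) if match else None


if __name__ == "__main__":
    page = "<html><head><title>Українська правда</title></head></html>"
    print(extract_title(page.encode("windows-1251")))
    # One of the garbled test titles, re-read as Windows-1251:
    print("Óêðà¿íñüêà ïðàâäà".encode("latin-1").decode("windows-1251"))
```

Trying UTF-8 before Windows-1251 in the fallback list matters: Windows-1251 accepts almost any byte sequence, so it would silently mis-decode UTF-8 pages if tried first, whereas strict UTF-8 decoding fails on typical Windows-1251 Cyrillic text and lets the next candidate run.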