Resources

 

Tools

 

 

 

Data resources

 

 

 

Research and publications

 

    McKellar, C.A. & Groenewald, H.J.  2012.  Frequency-based data selection for statistical machine translation with scarce resources.  (In Ndinga-Koumba-Binza, H.S. & Bosch, S.E., eds. Language Science and Language Technology in Africa: Festschrift for Justus C Roux.  Stellenbosch: Sun Media.  p. 271-290).

     

    Wilken, I., Griesel, M. & McKellar, C.A.  2012.  Developing and improving a statistical machine translation system for English to Setswana: a linguistically-motivated approach.  (In De Waal, A., ed. Proceedings of the Twenty-Third Annual Symposium of the Pattern Recognition Association of South Africa (PRASA), Pretoria, 29-30 November 2012. pp. 114).

     

    Griesel, M.  2011.  Sintaktiese herrangskikking as voorprosessering in die ontwikkeling van 'n Engels na Afrikaanse statistiese masjienvertaalsisteem.  Potchefstroom: NWU. (Dissertation - M.A. in Applied Language and Literary Studies).

    McKellar, C.A.  2011.  Dataselektering en -manipulering vir statistiese Engels-Afrikaanse masjienvertaling.  Potchefstroom: NWU. (Dissertation - M.A. in Applied Language and Literary Studies).

     

    Snyman, D.P., Van Huyssteen, G.B. & Daelemans, W.  2011.  Automatic genre classification for resource scarce languages.  (In Robinson, P. & Nel, A., eds. Proceedings of the Twenty-Second Annual Symposium of the Pattern Recognition Association of South Africa (PRASA), Vanderbijlpark, 22-25 November 2011. p. 132-137).

     

    Griesel, M., McKellar, C.A. & Prinsloo, D.  2010.  Syntactic reordering as preprocessing step in statistical machine translation from English to Sesotho sa Leboa and Afrikaans.  (In Nicolls, F., ed. Proceedings of the Twenty-First Annual Symposium of the Pattern Recognition Association of South Africa (PRASA), Stellenbosch, 22-23 November 2010. p. 105-110).

     

    Groenewald, H.J. & Du Plooy, L.  2010.  Processing parallel text corpora for three South African language pairs in the Autshumato Project.  (In De Pauw, G., Groenewald, H.J. & De Schryver, G.M., eds. Proceedings of the Second Workshop on African Language Technology (AfLaT), Valletta, 18 May 2010. pp. 27).

     

    Pienaar, W. & Snyman, D.  2010.  Spelling checker-based language identification for the eleven official South African languages.  (In Nicolls, F., ed. Proceedings of the Twenty-First Annual Symposium of the Pattern Recognition Association of South Africa (PRASA), Stellenbosch, 22-23 November 2010. p. 213-218).

     

    Groenewald, H.J. & Fourie, W.  2009.  Introducing the Autshumato integrated translation environment.  (In Màrquez, L. & Somers, H., eds. Proceedings of the 13th Annual Conference of the European Association for Machine Translation (EAMT), Barcelona, 14-15 May 2009. p. 190-196).

     

    Van Huyssteen, G.B. & Pilon, S.  2009.  Rule-based conversion of closely-related languages: A Dutch-to-Afrikaans converter.  (In Nicolls, F., ed. Proceedings of the Twentieth Annual Symposium of the Pattern Recognition Association of South Africa (PRASA), Stellenbosch, 30 November-1 December 2009. p. 23-28).

     

    Pienaar, J.A. & Tyers, F.  2008.  Extracting bilingual word pairs from Wikipedia.  (In Proceedings of the Speech and Language Technologies for Minority Languages Workshop (SALTMIL) at the Sixth International Conference on Language Resources and Evaluation (LREC), Marrakech, 26 May-1 June 2008. p. 19-22).

     

    Van Huyssteen, G.B., Puttkammer, M.J., Pilon, S. & Groenewald, H.J.  2007.  Using machine learning to annotate data for NLP tasks semi-automatically.  (In Orasan, C. & Kuebler, S., eds. Proceedings of the International Workshop on Computer-Aided Language Processing (CALP) at the International Conference on Recent Advances in Natural Language Processing (RANLP), Borovets, 27-29 September 2007).