Interview with Dallas Cao, creator of GT4T

1. Please tell me a little about yourself.


I’m a bit of an anomaly in the commercial translation industry. I don’t have the skills to be a translator. I’ve never worked for an LSP or agency. I don’t come from language academia and I don’t have a PhD in any kind of linguistics. Yet, I’ve worked with translators and interpreters all my life.


I started my career as a technical operations officer in the Central Intelligence Agency. My specialty was technical surveillance. That is, the use of audio and visual technologies to surveil people. Today, such surveillance is common with services like Amazon’s Alexa that continuously listen to the audio in your home, capture what you say and take action as they identify various commands.


Long before Alexa, I used the same technologies. Instead of a computer listening for voice commands, I worked with translators who created English transcripts of audio recordings for US policymakers.


What is your role in the translation industry?


I approach the technical challenges faced by translators from my experiences in the intelligence community. First, I learn a technology’s realistic capacity. Then, I find or create business processes to achieve the technology’s maximum benefits.


My approach is the reverse of the one commercial technology vendors take. They start from a business demand and then apply technology to satisfy that demand. That’s why my products and services aren’t like anything else in the commercial translation technology market.


I’ve defined my role as creating MT technologies that have a positive, holistic impact on translators in those moments when they are at the keyboard doing their translation work. In the moment of reviewing a raw MT suggestion, the only question to answer is, “Does this raw MT represent me?”


2. How would you define machine translation?


I think of “machine translation” (MT) as a distinction, not a definition. Our industry guides clients to use their terminology clearly and consistently, but we’re sloppy in how we use our own term “MT.” I’m constantly listening and adapting to how people use the term.


I’ve observed translators generally use “MT” as a noun for the raw translations generated by a computer. By comparison, technology vendors generally use the term as a verb for a computer’s translation process or the technology.


In both cases, the raw MT segments are judged as “good” quality based on subjective criteria (see item 7 below). As long as a translator expects “good” MT, they’re competing with the technology.


Ironically, MT software does not use quality-generating algorithms. Rather, it uses predictive algorithms, like the ones used in weather forecasting. A meteorologist feeds sensor data into the software. The software predicts tomorrow’s weather. It’s only a “good” prediction after we experience tomorrow’s actual weather. It would be ridiculous to say the software generated good or bad weather.


Likewise, a translator feeds a source segment (sensor data) into the software. The MT software predicts a translator’s translation. The prediction is “good” or “bad” only after a translator experiences and judges the raw MT.


How has MT progressed over the years?


From 1951 to 2006, that’s 55 years, first-generation MT software (rules-based MT, RbMT) predicted almost none of a translator’s work. MT as a professional translator’s tool was stagnant for a long time.


In 2006, the second-generation MT software (statistical MT, SMT) significantly improved in its ability to predict. At that time, it needed the power of expensive server computers, so SMT software spread online as a service (Google, et al). To serve millions of users per minute, these online services optimized the software for general-purpose use. In that process, they degraded the MT quality for professional use.


When used by professionals, general-purpose SMT systems predict roughly 5% of a translator’s work. By comparison, customizing an SMT system for a vertical specialization elevates predictions to as much as 30% of a translator’s work.


The third-generation MT software (neural MT, NMT) was launched in 2017. When used by professionals, NMT still predicts roughly 5% of a translator’s work. Some translators report the remaining 95% is subjectively better for problematic language pairs, such as English-Chinese.
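
One way to read the percentages above is as a match rate between raw MT and the translator’s final text. The sketch below is only an illustration under that assumption: the interview does not say how “predicts” is measured, and the function name `prediction_rate`, the exact-match rule and the sample segments are all hypothetical.

```python
def prediction_rate(raw_mt_segments, final_segments):
    """Share of segments where raw MT equals the translator's final text."""
    if len(raw_mt_segments) != len(final_segments):
        raise ValueError("both lists must have one entry per segment")
    if not raw_mt_segments:
        return 0.0
    matches = sum(
        1
        for raw, final in zip(raw_mt_segments, final_segments)
        if raw.strip() == final.strip()  # assumption: exact match after trimming
    )
    return matches / len(raw_mt_segments)


# Made-up example segments, for illustration only.
raw = ["Press the button.", "Close the lid now.", "Turn off power.", "See page 4."]
final = ["Press the button.", "Close the lid.", "Switch off the power.", "See page 4."]
print(f"{prediction_rate(raw, final):.0%} of segments predicted")
```

Under this reading, a 5% rate means the translator accepted one raw MT segment in twenty without changes, and a 30% rate means roughly one in three.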


What has not progressed?


Although MT has improved, our post-editing “best-practices” have not. The first machine translation post-editing (MTPE) guidelines to correct raw MT for publication were developed in the early 1960s. That was an era when MT systems predicted virtually none of the translator’s work. The ISO 18587:2017 post-editing specification codifies those guidelines, virtually unchanged since 1966.


3. What is your advice to the novice translator on how to approach machine translation?


We’re entering a new era when today’s novice and future translators will never have known a time without MT. It’s like today’s youth who never knew the phonograph or rotary dial telephone.


Change is inevitable. Today’s novice translators need to plan for unknowns and not design their translation career around the technology.


  1. Learn the craft of translation without technology assistance.

  2. Learn to find customers, sell your services and support those customers.

  3. Learn the business side of translation, time management, etc.

  4. Learn which technologies assist you and learn how to use them.

  5. Be prepared for change. Today’s tech might not be available next year, LILT.com for example.

  6. Look for alternate technologies. Technology vendors won’t tell you about them.


With regard to MT, linguistic corpora are the foundation of all current and foreseeable MT technologies. Translation memories are the easiest source for corpora. Therefore, novice translators should save all of their work in private translation memories.


A private TM immediately helps the translator in traditional CAT tools. Predictive technologies, like MT, predict your work based on your previous work. Your personal translation memories store that previous work. After growing for 2 or 3 years, the TM will be a great foundation for new assistive MT technologies.


4. What is your advice to the established translator who is looking into machine translation?


I think most “established translators” are set in their ways. They know which CAT tools they like and/or need for customers. They either use MT services or have decided never to use MT. There’s still a large group between novice and established who can benefit from looking at MT from a new perspective.


Is MT an option or a must?


Translation technology, including MT, is never a “must.” Even today, many translators don’t use CAT tools. Some revert to pencil and paper to escape from the mesmerizing keyboard and monitor. Here’s a better question. What technologies serve your goals and satisfy the customer’s requirements?


Every translator is different. MT helps many finish more work in less time. Some translators believe they work slower with a translation memory and MT. So, let’s look at some productivity numbers, assuming 8 hours of work per day and an average of 10 words per segment.


  • 2,500 words per day, 120 seconds per segment – translator’s traditional benchmark for years.

  • 3,500 words per day, 90 seconds per segment – what LSPs have shared they expect from translators.

  • 8,000 words per day, 30 seconds per segment – what some translators boast of on social media, without using translation memories or MT.
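
The arithmetic behind these figures can be checked with a short sketch. It uses only the 8-hour day and 10 words per segment stated above; the function name `seconds_per_segment` is made up for the example.

```python
SECONDS_PER_DAY = 8 * 60 * 60  # 8-hour working day, as stated in the text
WORDS_PER_SEGMENT = 10         # average segment length, as stated in the text


def seconds_per_segment(words_per_day):
    """Average time available per segment at a given daily word count."""
    segments_per_day = words_per_day / WORDS_PER_SEGMENT
    return SECONDS_PER_DAY / segments_per_day


for words in (2500, 3500, 8000):
    print(f"{words} words/day: about {seconds_per_segment(words):.0f} s per segment")
```

The exact results come to roughly 115, 82 and 36 seconds, so the figures in the list are best read as rounded approximations.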


As technologies improve and more translators work faster, the traditional benchmark of 2,500 words per day will not satisfy market demands. Although MT is not a “must,” it is one of many available tools that some translators use to satisfy increasing demands for higher throughput.


I suggest experienced translators who are looking into MT should understand that there are different kinds of MT applications with different results. Expand your horizons beyond the online services. Explore opportunities with customized MT. Compared to 12 years ago, customized SMT is simple and affordable, and it’s better quality than the online services.


5. Are translators earning more money with machine translation or less than before?


Like many things, the answer starts with “it depends.” The only way to earn more money as a direct result of MT would be to receive a higher word rate when using MT.


In reality, agencies and clients pay lower word rates when they know a translator uses MT. Therefore, if a translator who uses MT actually earns more money, the increase is indirectly related to the MT. I’ve identified three basic scenarios.


Agency provides the raw MT: The agency pays for raw MT and pays post-editors lower word rates or by the hour for MT post-edit (MTPE) services. The translator’s final earnings are higher when he/she achieves higher productivity that offsets lower word rates.


Translator pays for the raw MT: The translator subscribes to an online MT service and works with the raw MT via the CAT’s API. Most often, the API flags the segments as post-edited MT. Customers use those flags to pay lower word rates. Again, higher earnings, if any, come from productivity gains that offset lower word rates and additional MT subscription costs.


Translator pays nothing for raw MT: The translator buys an MT software application that uses the API to feed MT to a CAT tool, just like the online service. The application, however, does not flag the segments as post-edited MT. The customer pays full word rates because they are unaware the translator uses MT. The translator’s increased earnings come from higher productivity at full word rates and without paying MT subscription fees.
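
The trade-off in these scenarios can be sketched as a break-even calculation: how many words per day a translator must produce at a discounted rate, after MT costs, to match earnings at the full rate. Every number and name below is hypothetical and chosen only for illustration. The interview gives no actual word rates or subscription prices.

```python
def daily_earnings(words_per_day, word_rate, mt_cost_per_day=0.0):
    """Earnings for one day: words times rate, minus any daily MT cost."""
    return words_per_day * word_rate - mt_cost_per_day


def break_even_words(baseline_words, full_rate, discounted_rate, mt_cost_per_day=0.0):
    """Words per day needed at the discounted rate to match baseline earnings."""
    target = daily_earnings(baseline_words, full_rate)
    return (target + mt_cost_per_day) / discounted_rate


# Hypothetical figures: 2,500 words/day at 0.10 per word without MT,
# versus a discounted 0.07 per word plus 0.50 per day in MT fees.
needed = break_even_words(2500, full_rate=0.10, discounted_rate=0.07, mt_cost_per_day=0.50)
print(f"Break-even at about {needed:.0f} words per day")
```

With these made-up figures, the translator needs roughly 3,580 words per day just to stay even. In the third scenario the rate stays at the full value and the daily MT cost is zero, so any productivity gain counts as extra earnings.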


6. What are some of the pitfalls of translating with machine translation tools?


I’m the wrong person to ask, but I’ll share what I learned from my customers.


Is it really as easy as pressing a few hotkeys and letting the software do the job?


No, it’s not about hotkeys and automation. I’ve observed that the biggest pitfall for a translator using MT is failing to set realistic expectations. Metaphorically, it’s difficult to see the forest for the trees. That is, being language-focused by nature, translators fall into the trap of trying to fix each and every MT segment.


The best MT systems predict only 30% of a professional translator’s work. That means 70% of the raw MT needs editing. The pitfall is trying to “fix” every one of those segments. That’s a lot of editing.


To maximize productivity, a translator needs to balance editing versus translating anew. He/she needs to learn how to decide quickly whether to fix a segment or translate it anew.


7. People are constantly getting examples of hilarious translations made by Google Translate.


Is it really that bad?


Are there better online machine translation tools?


Are any of them free?


We need to acknowledge that judging the quality of any expressed language is subjective. People perceive the quality of their native language differently. Translators judge translation quality differently. Judging language quality often becomes an IKIWISI proposition (I know it when I see it) because one translator sees a “good” translation while another sees the same translation as “bad.”


Is GT really that “bad”? Some studies report Google NMT predicts a translator’s work about 5% of the time. Customized SMT predicts the translator’s work up to 30% of the time. That means 70% to 95% of the time, the raw MT segments are somewhere between poor and really bad. It’s easy to find “hilarious translations.”


I don’t track any of the online services. I measure Google’s NMT results because it’s a well-known reference for my customers’ experiences.

Many online MT services are free from a web browser, but using their API from a CAT tool comes at a cost. For example, Google’s API costs $20 per million characters (including spaces) regardless of the language. If you discard bad MT segments, you already paid for them. For a full-time career freelancer, Google usage can cost $10 per month or more.
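
A rough estimate of the monthly API bill can be sketched from the $20 per million characters quoted above. The 22 working days per month and 6 characters per word (including the trailing space) are assumptions made for this example, not figures from the interview, and the function name is hypothetical.

```python
PRICE_PER_MILLION_CHARS = 20.0  # USD, the Google API rate quoted in the text


def monthly_api_cost(words_per_day, working_days=22, chars_per_word=6):
    """Estimated monthly cost when every source segment is sent to the API.

    working_days and chars_per_word are assumed values for illustration.
    """
    chars = words_per_day * working_days * chars_per_word
    return chars / 1_000_000 * PRICE_PER_MILLION_CHARS


for words in (2500, 3500, 8000):
    print(f"{words} words/day: about ${monthly_api_cost(words):.2f} per month")
```

Under these assumptions the estimate lands between about $6 and $21 per month, the same order of magnitude as the $10 figure above. Languages with longer words, or resending segments, would push the cost higher.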


Intento (https://inten.to/) is a company that specializes in tracking the quality and changes across dozens of online MT services. They periodically publish reports that also include rates for the various services.


8. Tell us a little bit about your own MT software: Slate Desktop (by Slate Rocks LLC).


Slate Desktop is an MT software application that runs on your PC. It’s like having a private Google Translate without the internet.


Slate Desktop converts the translator’s translation memories into an MT service provider. That provider shows up in your CAT tool’s API. Then, you use it like any other provider.


What does it do that other tools do not?


You buy Slate Desktop for a one-time license fee. There are no subscriptions, no usage fees. Once you buy Slate Desktop, the raw MT segments are free.


Slate Desktop’s MT uses your words and your style of writing, not the words and style imposed by the online MT services. It learns your words and style from your translation memories.


Slate Desktop is customized SMT. Its quality predicts up to 30% of a translator’s work. The quality is directly related to the size and quality of your translation memories that it converts to engines.


Higher prediction rates result in higher productivity.


Because it’s a desktop application, you are the only one who knows you’re using MT. It does not flag your work as post-edited MT, so your customers cannot discount your word rates.


All of your customers’ information stays on your computer. Slate Desktop never sends segments to the Internet in violation of your customer’s security policies.


9. What do you predict about MT’s future and the translators who use it?


I think our industry has crossed a threshold. MT is here to stay. It’s not an afterthought anymore. It’s the logical next step in computer-assisted translation. I shared some predictions in earlier questions.


Today’s applications are primitive. MT started in the early 1950s but it’s been a commercially viable technology for professionals only in the last decade. By comparison, IBM released the first PC in 1981. WordPerfect was the word processing standard for the first 10 years. Today, the de facto standard is MS Word, but it didn’t displace WordPerfect until the early 1990s, roughly 10 years after the first IBM PC.


I’m certain tomorrow’s MT technologies will look and function very differently from today’s tools. Tomorrow’s applications will be as different from today’s CAT tools as today’s smartphones are different from yesteryear’s telegraphs with Morse code.


MT will change in ways we cannot imagine today, but it will not replace translators. Translators who use MT as an assistive tool will increase their earning power, provided their increased productivity offsets the lower word rates and additional costs.
