20 March 2008

Localization, Globalization - What's in a word?

I repeat the title: "Localization, Globalization - What's in a word?" Plenty, if you want people to find you on the Web.

If you're a language provider trying to get noticed among search engine results, you know that it's important to choose your words carefully. In my post Top 5 Web searches I outlined several keyword combinations people frequently use to reach this blog and our Web site. Two of the most important terms in our industry deserve some attention.

Globalization/Globalisation. Many vendors use this term, and many of us in the industry apply it to our job titles: "Globalization Consultant," or "Sr. Globalization Manager." There was a time when I used it as a keyword for Web pages, pay-per-click campaigns and even my business card.

I stopped after a while, though. "Globalization" is often associated with rioting by farmers in developing countries, and multinational companies that behave like sovereign states. Yes, I do riot and behave imperiously from time to time, but those are not the services I'm advertising, so the use of that word brought me erroneous visits from people doing research on the World Trade Organization and coffee prices.

If you qualify it with "software globalization" or "Web globalization," the search engine delivers more accurately, but it's still a prickly bit of jargon. Try telling your uncle or somebody you're cold-calling that you're in globalization; they'll think you're talking about leftist politics. The word just gets in the way.

Localization/Localisation. This term is different. As with "globalization," most uncles and persons you cold-call don't know the term; however, it's unlikely that they'll confuse it with something else, because it's unique.

Or so I thought.

I replaced occurrences of "globalization" with "localization" among my keywords and I now receive far fewer mis-visits as a result of search engine results, but I think the mis-visits will increase before long. The English word "localize" means "to fix or assign to a particular place," but it's not a very common term in English. The term is a bit muddy now (because of French), and will get muddier in the future (because of cellular telephony).

Early on, I received queries from France, Italy and Canada using "localisation" in combination with words that suggested "how to locate something" or "where somebody is right now." These people land on my site or blog by mistake; they're trying to find a device for locating merchandise in a warehouse, or want to know how to restrict malaria to a particular region. This is just a case of mistaken (keyword) identity.

With cellular communications, however, come location-based services equipped to help your favorite people (and favorite retailers, it goes without saying) "localize" you. For example:

About Krillion - Krillion is a premier provider of local shopping search information, serving today's ready-to-buy consumers who research products online for purchase from retailers in their area...The powerful combination of our patent-pending Krillion Localization Engine, localized search results covering over 10,000 products in 40,000 U.S. locations, and unique, real-time StockCheck(TM) tool enables consumers to speed their research-to-purchase process and take advantage of in-store pickup services offered by many retailers near them.
(I have no financial interest in Krillion.)

We've all dreamed of the day when businesses would begin to talk more about localization, because we could spend less time educating and more time improving the process. If the term gets muddied, however, we'll spend time explaining which "localization" we offer (language, not global position).


10 August 2007

Localization - Top 5 Web searches

What is your most frequently used Web search regarding localization? Are there search phrases you check every now and again to see what new results they yield?

Over the last couple of years, I've tuned the keywords on this blog and on our Web site, www.1-for-all.com, for both pay-per-click and search engine optimization. I have a pretty good idea of which search topics bring people to this blog, and here are the top five topics, with my comments:
  1. Localization of HTML help projects (RoboHelp, CHM, etc.). I can't tell whether people have trouble with this, or whether they're poking around to find out whether they are going to have trouble with it once they undertake it. My hunch is that RoboHelp, the dominant product for creating HTML help, either doesn't do a good job creating localized help systems or doesn't do a good job of explaining how to create them. Our experience has been that double-byte localization requires a specifically enabled, separate version of RoboHelp, which strikes me as silly, but perhaps Adobe has addressed this by now.
  2. How much to charge/pay for translation. Everybody wants to know this. Responding to the frequency of these queries, I wrote an article called "Going Global Without Going Broke" to help people who want a few benchmark figures from which to cobble together a budget. If you're any further along than that, you should just contact a vendor, push your files to him and get an estimate. If you're a translator or want to become one, phone a localization company, tell them what you can do and find out what they'll pay you.
  3. Localization project manager/management. I would guess that about half of these are vendors (a.k.a. localization service providers, or LSPs) and half are companies with localization needs to fill.
  4. Localization jobs. Most of these queries come from Ireland. There's a relatively high concentration of localization talent in that country, and perhaps a high rate of turnover as well.
  5. What is localization? Again, the frequency of these queries prompted me to write articles called "Opening the Black Box." I'm glad to see people asking this question, because it demonstrates continuing interest in this specialty. At the same time, however, I notice that some of these queries come from China and India, suggesting to me that the IT shop that has just promised you it can localize your software for one-seventh the price you've gotten from other vendors is now trying to figure out what's involved in fulfilling that promise.
At the other end of this list are the searches we're not seeing: questions we believe people should be asking but aren't.


20 April 2007

Localization Testbenches, Part IV (Online Help)

What are you using to test your localized products? If you're handing them to your domestic QA team and expecting that they'll intuitively test them with correct language locale settings, you may be in for an unpleasant surprise.

3) Help files
Your online documentation also deserves some testing. After its contents (usually HTML pages or XML documents) have been translated - in the correct encoding for the target language - the help project will be compiled, in the same way that software applications are compiled. This compilation step needs to account for the correct language, locale and encoding, and this doesn't happen by itself, no matter how lucky you may feel today.

Again, it's important to test the help file in an environment that closely matches your customers' environment. Run your Greek help file on a native Greek operating system. Be sure to test the main window, the contents pane and the index for properly displayed characters. Above all, perform a few searches using native characters in the Find field to ensure that your help file's index was properly created and encoded; if your searches are successful, then your customers' searches will probably be successful as well.

Note: HTML Help under Windows has some idiosyncrasies when it comes to the table of contents (TOC) pane and the main window. Most tools like RoboHelp will properly encode the TOC and main pane content for, say, Japanese, when all of the content resides in the same project. However, if you're building your HTML help files with your own tools (e.g., Perl scripts and hh.exe), you may find that encoding sauce for the goose is not encoding sauce for the gander. We've found, for example, that the HTML pages displayed in the main window are happy with UTF-8, whereas the TOC pane won't support UTF-8 but will support Shift-JIS.
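If you do build with your own tools, one way to handle the split is to keep everything in UTF-8 and re-encode only the TOC file just before compiling. Here is a minimal sketch in Python rather than Perl. The function names and file names are hypothetical, and it assumes the TOC source is plain UTF-8 text (with or without a byte-order mark).

```python
from pathlib import Path


def reencode_text(data: bytes, source: str = "utf-8-sig", target: str = "shift_jis") -> bytes:
    """Decode bytes from the source encoding and re-encode them in the target encoding."""
    # "utf-8-sig" accepts UTF-8 with or without a leading byte-order mark.
    text = data.decode(source)
    # Strict encoding (the default) raises UnicodeEncodeError when a character
    # has no Shift-JIS representation, instead of silently corrupting the TOC.
    return text.encode(target)


def reencode_file(src: Path, dst: Path, source: str = "utf-8-sig", target: str = "shift_jis") -> None:
    """Write a re-encoded copy of src to dst."""
    dst.write_bytes(reencode_text(src.read_bytes(), source, target))


if __name__ == "__main__":
    # Hypothetical file names: a UTF-8 master TOC and the copy handed to the compiler.
    reencode_file(Path("toc_utf8.hhc"), Path("toc.hhc"))
```

The strict failure is deliberate: if a translator has used a character that Shift-JIS cannot represent, it is better to find out at build time than in front of a customer.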


20 September 2006

Segmentation and Translation Memory

To get the broken sentences in the new files to find their equivalents (or even just fuzzy matches) in translation memory, we have three options:

  1. Modify the Perl scripts that extract the text from the header files into the HTML, so that the scripts no longer introduce the hard returns.
  2. Massage the HTML files themselves and replace the hard returns with spaces.
  3. Tune the segmentation rules in Trados so that it ignores the hard returns (but only the ones we want it to ignore) and doesn't consider the segment finished until it gets to a hard stop/period.
To go as far upstream as possible, I suppose we should opt for #1 and fix the problem at its source. This seems optimal, unless we subsequently break more things than we repair. Options #2 and #3 are neat hacks and good opportunities to exercise fun tools, but they burn up time and still don't fix the problem upstream.

Also, I don't want the tail to wag the dog. The money spent in translating false positives may be less than the time and money spent in fixing the problem.
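For what it's worth, option #2 can be roughed out in a few lines. This is only a sketch, in Python rather than Perl, and it works on the extracted text, not on real HTML with tags. It assumes a sentence ends at a period, exclamation point or question mark at the end of a line, so a fragment ending in an abbreviation such as "etc." would be cut too early. The function name is hypothetical.

```python
import re

# Assumption: a line that ends with ., ! or ? (optionally followed by a
# closing quote or bracket) ends a sentence.
SENTENCE_END = re.compile(r"[.!?][\"')\]]?$")


def rejoin_fragments(text: str) -> str:
    """Join hard-wrapped fragments until a line ends with sentence-final punctuation."""
    paragraphs = []
    current = []
    for line in text.splitlines():
        fragment = line.strip()
        if not fragment:
            continue  # skip the blank lines left by the doubled hard returns
        current.append(fragment)
        if SENTENCE_END.search(fragment):
            paragraphs.append(" ".join(current))
            current = []
    if current:  # trailing text that never reached a hard stop
        paragraphs.append(" ".join(current))
    return "\n\n".join(paragraphs)
```

Note that this borrows the rule from option #3 (don't close the segment until a hard stop) and applies it to the files instead of to the segmentation settings, so it shares that option's weakness: it patches the symptom downstream.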


15 September 2006

Moving the Localization Carpet under the Source Text

Here's the mess I face.

The HTML files are filled with paragraphs formatted like this:

Currently, this function gets called for trust overrides and

client authentication. On client authentication, the supplied

interface contains the server's Certificate Authorities Distinguished

Names list (see references) and the negotiation handler

always gets called so as to give a chance to the client to supply

the correct client certificate based on the DN list.


At the end of each line are two hard returns. It wasn't always this way, so each complete sentence is sitting happily in translation memory. Unfortunately, Trados pulls in each of these six 80-character fragments and calls it low- or no-match because it can't find enough of a concordance. This is a classic case of false positives driving up translation costs.
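To see why a fragment scores so badly, here is a rough illustration in Python. It is not how Trados computes its match values; it uses the character-level ratio from Python's standard difflib module as a stand-in, and the 0.75 threshold is an assumption chosen only for the demonstration.

```python
from difflib import SequenceMatcher

# Assumption: 0.75 stands in for a minimum fuzzy-match score. Real
# translation memory tools use their own thresholds and scoring.
FUZZY_THRESHOLD = 0.75


def similarity(a: str, b: str) -> float:
    """Return a character-level similarity score between 0.0 and 1.0."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def best_match(segment: str, memory: list[str]) -> float:
    """Return the best score of a segment against every stored sentence."""
    return max(similarity(segment, stored) for stored in memory)


# The complete sentence, as it would sit in translation memory.
TM_SENTENCE = (
    "On client authentication, the supplied interface contains the server's "
    "Certificate Authorities Distinguished Names list (see references) and "
    "the negotiation handler always gets called so as to give a chance to "
    "the client to supply the correct client certificate based on the DN list."
)

# One of the fragments produced by the doubled hard returns.
FRAGMENT = "interface contains the server's Certificate Authorities Distinguished"

if __name__ == "__main__":
    for label, segment in (("fragment", FRAGMENT), ("whole sentence", TM_SENTENCE)):
        score = best_match(segment, [TM_SENTENCE])
        verdict = "match" if score >= FUZZY_THRESHOLD else "no match"
        print(f"{label}: {score:.2f} ({verdict})")
```

The fragment is a word-for-word piece of the stored sentence, yet it covers so little of it that the score falls well under the threshold, and every word in it gets billed as new.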

I'm still exploring options. Meanwhile, there's no sense in starting the translation work.


10 September 2006

Who's in trouble: the Localization Vendor or me?

The localization estimate has come back on the HTML files in the API Reference, and it's as ghastly high as I'd feared.

The vendor's project manager does something clever with these thousands of files: she uses SDLX Glue to glue them together into six or seven batches of several hundred files each. That way she avoids carpet-bombing the translator with jillions of files; this also keeps the translator in the translation business and out of the file management business. After translation, the project manager un-glues them using SDLX Glue and hands them off internally for engineering, QA, etc.

The downside to this technique is that the TM analysis comprises only the six or seven files. I can't see down to the level of granularity I want unless I ask for a special analysis down to the file. They don't mind doing it for me, but it's not in their regular workflow and I have to wait for it.

Anyway, the count of unmatched words is preposterously high, and I'm pretty sure it's due to changes in the scripts that extract the HTML from the header files. Sentences and segments in version 4.0 don't match those in the last version because of things like double line-feeds and mucked-up HTML tags.

I need to have a deeper look at the original English HTML files and bucket them for handoff. BeyondCompare shows me that the text in some files hasn't changed at all, and I'll need to spoon-feed these to the vendor.

Either that or get shot down when I take this estimate up to the third floor for approval...
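The bucketing itself could be scripted along these lines. This is a sketch under assumptions, not the process I actually used (that was BeyondCompare and eyeballs): it assumes the old and new HTML trees share the same relative file names, and the bucket names and function names are hypothetical.

```python
import hashlib
import re
from pathlib import Path


def digest(data: bytes) -> str:
    """Return a SHA-256 hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def normalized(data: bytes) -> bytes:
    """Collapse every run of whitespace (including doubled line feeds) to one space."""
    return re.sub(rb"\s+", b" ", data).strip()


def bucket_files(old_dir: Path, new_dir: Path, pattern: str = "*.html") -> dict[str, list[str]]:
    """Sort the files under new_dir into buckets by how they differ from old_dir."""
    buckets = {"identical": [], "whitespace_only": [], "changed": [], "new": []}
    for new_file in sorted(new_dir.rglob(pattern)):
        rel = new_file.relative_to(new_dir)
        old_file = old_dir / rel
        if not old_file.is_file():
            buckets["new"].append(rel.as_posix())
            continue
        old_bytes, new_bytes = old_file.read_bytes(), new_file.read_bytes()
        if digest(old_bytes) == digest(new_bytes):
            buckets["identical"].append(rel.as_posix())
        elif digest(normalized(old_bytes)) == digest(normalized(new_bytes)):
            buckets["whitespace_only"].append(rel.as_posix())
        else:
            buckets["changed"].append(rel.as_posix())
    return buckets
```

The "whitespace_only" bucket is the interesting one: those are the files whose text hasn't changed but which will still analyze as low- or no-match because of the line feeds.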


05 September 2006

The Lonely Localization Manager

It's a bit strange, the way in which I get my work done.

Naturally, I play the role of localization manager and primary contact at developer meetings, and I manage budget and schedule for several projects at once. I've become the lightning rod for issues ranging from character sets to encodings to what-do-these-Chinese-characters-say. All in all, most localization issues are pretty well in hand because I've been able to manage them in a way that conforms to best practices, with a bit of experimentation thrown in to see how much better we can make things.

I suspect, though, that most localization managers who might read this have teams and staff and reporting structures. I would bet they also spend more time scrapping internally with development managers, QA and upper management, struggling to make localization conspicuous and wildly successful. I just make it work.

As localization manager for a software company in the early 1990s, I went through that. I get a lot more done this way, and I enjoy it more.

There was that exception last year. I had a client that drove me bonkers because their entire corporate culture was geared to nothing more than tolerating localization, and that with a clothespin on their nose. Wish I'd been blogging during that engagement; it was a wild ride.
