Microsoft seeks to build more useful thesaurus

Daily Newsletters

Sign up to ZDNet UK's daily newsletter.

NEWS

The Next Generation Writing Assistance project within Microsoft's research unit is aiming to build a more useful thesaurus, by tapping techniques used to translate languages.

Although thesauri are good at finding synonyms, they require the user to pick the right one, since they can't understand context. That's where machine language-translation techniques come in.

"We've taken the actual translation tables... and said: 'If a word in Chinese maps to two different English words, maybe those two words are synonyms, with some probability'," said Chris Brockett, a computational linguist and one of the Microsoft researchers leading the project.

This approach offers two key benefits over using a standard thesaurus: it can handle phrases, as opposed to single words, and it can draw on the context in which a phrase is used.

Brockett plans to show off a prototype of the tool next week at TechFest, Microsoft's annual internal science fair. It's one of dozens of projects that will be shown as part of an effort to expose Microsoft's business units to the work being done in Microsoft's research labs.

As is the case with most of the projects that will be displayed at TechFest, the thesaurus effort is still in its infancy.

"We're still working on the algorithms and how much work we give to the language pairs," Brockett said. "We have to get the quality up. There are usability issues that have to be looked into."

Over time, Brockett hopes the technique could be used to effectively rewrite whole sentences. However, would-be plagiarists should beware. Although the technology could one day translate a whole Wikipedia article for you, it would probably translate the article the same way for everyone else as well. Plagiarism-detection software is also evolving.

The thesaurus technology would be naturally suited to inclusion in Word, which already has a built-in traditional thesaurus.

The technology could also help Microsoft in another key area: search. While search engines are good at finding names, for example, that have just one form, they have more difficulty finding expressions that can be phrased in multiple ways.

Post your comment

In order to post a comment you need to be registered and logged in.

You can also log in with Facebook. Log in or create your ZDNet UK account below

  • Login

Will not be displayed with your comment

By signing up for this service, you indicate that you agree to our Terms and Conditions and have read and understood our Privacy Policy. Questions about membership? Find the answers in the Community FAQ

Get ZDNet UK's daily newsletter

Enter your email address to sign up

ZDNet UK Live

k0tcs3

Sure, that makes perfect sense. Pay wrong-doers money and thank them for breaching your security and pointing out your flaws, that would surely...

21 minutes ago by k0tcs3 on US indicts Romanian over NASA climate change hack
Random_Error

I think he's referring specifically to Android apps, as Apple do regulate their App Store, but Google seem to let any old crap onto the Android store!

26 minutes ago by Random_Error on RIM: BlackBerry will keep 'garbage' apps out of store
Paul Fezziwig

Keep the crap apps out?! How will they compete with Android and Apple's claim to fame of having so many life changing apps? I wonder if the media...

6 hours ago by Paul Fezziwig via Facebook on RIM: BlackBerry will keep 'garbage' apps out of store
Aigars Mahinovs

It has been shown time after time that if there is an author store that sells the songs at even 1$ per song and gives you a high-quality digital...

7 hours ago by Aigars Mahinovs via Facebook on Copyright isn't working, says European Commission
awbMaven

""As a result of Butyka's alleged conduct, researchers were unable to use the computers for more than two months while NASA removed the malicious...

9 hours ago by awbMaven on US indicts Romanian over NASA climate change hack
subhorup

It simultaneously worries me and uplifts me that a self-proclaimed group of internet activists name themselves after Indian mythical figures....

17 hours ago by subhorup on Anonymous activists release PCAnywhere source code
naviathan

It's actually far easier to work anonymously on the internet than you think. With tools like Tor bouncing your traffic around the world before...

21 hours ago by naviathan on Anonymous activists release PCAnywhere source code
Agnostic_OS

1000272134 and bluedalmatian with you both there but then I'm still in 10.04 land (and happy with it)

21 hours ago by Agnostic_OS on Ten factors that make Ubuntu 11.10 a hit
apexwm

Interesting article and definitely see your points on the products mentioned. One of the top products for our Help Desk (approximately 20% of all...

1 day ago by apexwm on Ten flawed products that derail productivity
Paul Hutchinson

Absolutely - this should obviously not be handled my isp - but handled by their hosting operator. What's been suggested here is that my isp police...

1 day ago by Paul Hutchinson via Facebook on MPs urge ISPs to take down terrorist material
Techs UK

Looks like a great phone. I don't notice any deficiencies in WP7. used IOS before, that's pretty good. I don't spend much time in Apps, all i need...

1 day ago by Techs UK on Nokia pins US 're-entry' hopes on Lumia 900
Larry Bloggy

Now with the help of these apps you are always synced with MS outlook while on the move. Just download apps like xobni or outlookreflex and get...

1 day ago by Larry Bloggy via Facebook on Outlook Social Connector beta 2 and the LinkedIn connector
mike40g123

Your details are wrong. The version currently being made is the one with 2 USB ports, 256MB RAM and a network port. This is the Model B. The...

1 day ago by mike40g123 on Raspberry Pi boards set to go on sale
Moley

The thing that has been puzzling me for quite a while is how Anonymous can remain anonymous whilst not only being active on the Internet but also...

2 days ago by Moley on Anonymous activists release PCAnywhere source code
Don Dilly

If what Semantec is saying is rue, that is even worse and shows a complete disregard for thier users. If what Anonymous claims is true and the...

2 days ago by Don Dilly via Facebook on Anonymous activists release PCAnywhere source code
MattChurchy

Didn't seem particularly biased to me either. Oh though you might have mentioned some other competitors with free search and email services...

2 days ago by MattChurchy on Time for an evil umpire: Google, Microsoft & privacy
Simon Bisson and Mary Branscombe

James - exactly as much as anyone paid you for your comment; I don't feel that I need to say that I'm independant and unbiased, but just for you...

2 days ago by Simon Bisson and Mary Branscombe on Time for an evil umpire: Google, Microsoft & privacy
Carl White

Once they realise symantec are willing to pay real money, they will simply keep extorting, unless of course symantec/authorities can use the...

2 days ago by Carl White via Facebook on Symantec offered hackers $50k in source code sting
Jonathan Hassell

You can find more information on BS 8878 by Jonathan Hassell its lead-author at http://www.hassellinclusion.com/bs8878/ The page includes a...

3 days ago by Jonathan Hassell on BSI publishes first British web accessibility standard
servermanagement

Thanks for this list. Now I know, what to include on my system to make it more functional.

3 days ago by servermanagement on Ten flawed products that derail productivity