Mastering New Challenges in Text Analytics

White Paper Published By: SPSS, Inc.

This paper briefly defines text analytics, describes various approaches to text analytics, and then focuses on the natural language processing techniques used by text analytics solutions.



Tags : 
spss, text analytics, data management, statistical analysis, computational linguistics, web sites, blogs, wikis

SPSS, Inc.
Published:  Mar 31, 2009
Type:  White Paper
Length:  24 pages

Technical report
Mastering New Challenges in Text Analytics
Making unstructured data ready for predictive analytics
Table of contents
Introduction
What is text analytics and how is it used?
Approaches to understanding text
The SPSS text analytics process
Applying text analytics at the enterprise level
Conclusion
SPSS products for text analytics
About SPSS Inc.
Appendix A: An explanation of some text analytics terms
Appendix B: Algorithms used for assigning equivalence classes
Appendix C: Examples of Text Link Analysis
Additional reading on text analytics
SPSS is a registered trademark and the other SPSS products named are trademarks of SPSS Inc. All other names are trademarks of their respective owners. © 2008 SPSS Inc. All rights reserved. MCTWP-0408

Introduction

It's no secret that the world has seen an explosion of information in the past 15 years, an explosion that experts predict will continue as the millions of people who use online resources continue to expand their usage, and the millions of people who do not yet have access to such resources gain it. Similarly, information stored as text in both business and government organizations has grown exponentially.
To name just a few examples:
- Opinion surveys are increasingly conducted online and results shared in real time
- The boom in software applications supporting sales, customer service, or call center operations has led to massive amounts of text stored electronically in these applications' notes fields
- Technology analysts at IDC estimate that 62 billion e-mails are sent every day
- Searchable Web sites generate enough information every day to fill millions of books
- Web logs (blogs) and wikis, created by individuals and groups for personal and professional purposes, are increasing exponentially: as of this writing, there may be more than 100 million blogs, with a new one created every second
Such a vast expansion of the scale of global information exchange would have been almost unimaginable 40 years ago, when most business and government communications, as well as news reports and advertising, were paper-based.
Yet it was 40 years ago that visionary researchers began to seek ways to enrich the knowledge of those working in medicine and other sciences, in government agencies, and in business by using computer technologies to uncover previously unknown connections in large collections of textual documents. They created the discipline known as computational linguistics, which is now practiced at numerous universities and public and private research centers worldwide. Computational linguists initially focused their efforts on finding ways to categorize and explore concepts found in books, scholarly journals, legal briefs, patent applications, newspapers, reports, and other paper-based records that could be converted to digital formats. More recently, their efforts have expanded to include ways to "mine" the vast amount of textual information that is published digitally: online editions of newspapers, academic journals, and conference proceedings, for example. In addition, there is a wealth of content that originates in digital form, such as Web sites, blogs, wikis, e-mails, and instant messaging (IM), as well as text embedded in forms, surveys, and scientific, government, or corporate databases.
There is a growing recognition t...
