INTERFACING INFORMATION
CSLI Industrial Affiliates Conference, May 12-14, 1999
Center for the Study of Language and Information
Cordura Hall, Room 100
Stanford University
http://www-csli.stanford.edu/csli/Tutorials/May99.schedule.html
Wednesday, May 12, 1999
-----------------------
9:00 - 9:15 Welcoming Remarks, John Perry, Director of CSLI
9:15 -10:30 Participants' Presentations
10:30 -11:00 Break
11:00 -12:30 Keynote Address: DOUGLAS ENGELBART, Bootstrap Institute
http://www.bootstrap.org/org.htm#1
******* 12:30 - 1:30 Lunch, demos
1:30 - 2:30 Methods of Eyetracking and Analysis: The
Advanced Eye Interpretation Project
Greg Edwards, CSLI, http://www-csli.stanford.edu/~gedwards/
Andrew DeVigal, Poynter Institute, http://www.poynter.org/
The phrase "the eyes are the windows to the mind" points to
the benefit of using eyetracking to understand users. However, given that
people move their eyes a lot -- 2 to 4 times a second, typically --
methods and infrastructure are needed to make sense of this rich source of
information. This session will focus on the software approach we
are taking to record, analyze, and understand what people's eyes do when
they are working naturally in a computing environment. In
addition to the software, we will discuss hardware requirements. This
session will provide background on how to set up specific projects for
understanding users.
2:30 - 3:00 Break
3:00 - 5:00 Eyetracking and Online Design
Andrew DeVigal, Poynter Institute
Greg Edwards, CSLI
What are the current challenges of presenting news and
information online and how can tracking where people look on news sites
help us? This session will focus on what the online news
industry is facing today in terms of visual presentation and go through
some of the thinking processes of designing online news sites. We will
also review some of the limitations and challenges that online designers
and producers face on a daily basis. Because the Internet is in its
relative infancy, especially in terms of news organizations, we have a
wide variety of different presentations for web sites. We will talk
about those differences and how the results of an eyetracking study may
help us better understand what our users and audiences want.
Thursday, May 13, 1999
----------------------
9:00 -10:30 Introduction to Information Extraction: a Tutorial I
David Israel, Doug Appelt, SRI International
http://www.ai.sri.com/~israel/
This tutorial will cover the what and the how of Information
Extraction (IE) systems. First we characterize the range of tasks
usually intended for IE techniques, and then describe the various
approaches to implementing these techniques, discussing the advantages
and disadvantages of each. Most IE systems process texts in
sequential steps ("phases") ranging from lexical and morphological
processing, through recognition and typing of proper names and parsing
of larger syntactic constituents, to resolution of anaphora and
coreference.
Finally, IE systems have a domain phase that recognizes events and
relationships relevant to the specific IE task. We shall discuss
various approaches to each of these tasks in turn, and examine their
suitability for different types of IE problems. We will discuss the
problems and advantages of incorporating various external resources
into extraction systems, including large lexicons, gazetteers, and
part-of-speech taggers, and conclude with a discussion of template
design principles that can have a significant impact on the difficulty
of the IE task.
10:30 -11:00 Break
11:00 -12:30 Introduction to Information Extraction: a Tutorial II
David Israel, Doug Appelt, SRI International
http://www.ai.sri.com/~appelt/
******* 12:30 - 1:30 Lunch, demos
1:30 - 3:00 Internet Marketing I
Ward Hanson, Stanford Graduate School of Business
http://gobi.stanford.edu/facultybios/bio.asp?ID=59
This tutorial mixes industry insights and theoretical approaches to
give a framework for understanding and implementing marketing on the
Internet, based on the technological, marketing, and economic
opportunities the Internet creates. The goal is to
understand not only how successful Net marketing operates, but also
how existing organizations should combine the Internet with their
traditional marketing approaches. Key issues of this tutorial include
what is different about marketing on the Net, how the different
on-line business models function, how the Net is changing marketing in
"the real world," what barriers exist to consumer adoption, how
consumers behave on-line, and what traditional marketing practices are
most threatened. Several Net marketing themes will be covered. They
include: on-line quality enhancement, personalization,
community-building, real-time marketing, and on-line customer
management.
3:00 - 3:30 Break
3:30 - 5:00 Internet Marketing II
Ward Hanson, Stanford Graduate School of Business
Friday, May 14, 1999
--------------------
9:00 - 9:15 Data Mining and Machine Learning
Pat Langley, CSLI, DaimlerChrysler, and ISLE
http://www-csli.stanford.edu/cll/langley.html
This brief overview will discuss the potential of data mining and
its relation to its component fields, including machine learning,
databases, and visualization. It will also consider some factors
that determine the success of data-mining applications but that are
seldom mentioned in the literature, such as problem formulation,
representation engineering, and data manipulation. Decisions about
these issues are often more important than ones about the particular
data-mining algorithm used in the effort.
9:15 -10:30 Case Studies of Data Mining for Business
Kamal Ali, CSLI and ISLE
http://www.isle.org/~ali
In recent years, data-mining techniques have let corporations take
advantage of large, in-house data repositories that they have accumulated
on customers. Data mining differs from applied statistics and machine
learning by emphasizing the analysis of very large data sets and
outputs that are comprehensible to business users. Successful fielded
applications of data mining have included using decision trees to filter
credit-card applications (American Express), using market-basket link
analysis to determine up-sell and cross-sell opportunities for retail
(Walmart, NeoVista), using clustering techniques for customer retention
(John Hancock, IBM) and using data visualization (Ford, SGI). This
presentation will give a tour of some major types of data-mining methods,
followed by two case studies of their application. The first project,
carried out by ISLE researchers for a major US bank, used both naive
Bayes and decision-tree induction to predict which small-business loan
applicants were likely to default. The second effort, carried out
by the speaker while at IBM, aimed to find car owners who were likely
to switch insurance companies. The latter project highlights how one
can trade off a more complex, accurate predictive model against a
smaller, more understandable model that was better suited to modify
customer behavior and thus achieve the goals of the insurance company.
10:30 -11:00 Break
11:00 -12:30 Adaptive Bayesian Networks
Moises Goldszmidt, SRI
http://www.erg.sri.com/people/moises
Adaptive Bayesian networks are slowly but steadily being adopted as the
core representation and inference engine in decision support systems,
data mining applications, and diagnostic engines. Some examples are
fraud detection (AT&T), exploratory data analysis (NASA), end-to-end
customer service (Ricoh), device diagnostics, collaborative filtering,
and email sorting (Microsoft), and pattern classification (SRI). There are
three main reasons for their popularity and acceptability: first, they
enable both a compact representation of probability measures and
efficient inference procedures; second, in addition to statistical
information, they represent qualitative and structural information
about the domain of application; third, they are capable of mixing
statistical information coming from the data with knowledge coming
from experts. In this talk I will provide a gentle introduction to
Bayesian networks and to the principles behind their use in pattern
recognition, data mining, and statistical induction. I will focus on a
specific class of networks called TAN, which have tractable induction
algorithms, excellent performance, and desirable properties of
robustness (they require low-order statistics). These properties make
TAN an excellent candidate for pattern classification/data mining applications.
******* 12:30 Lunch, and end of conference.
&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&
Registration:
*************
Participant Name & Title : ______________________________________________
Company Name: ___________________________________________________________
Address: ________________________________________________________________
City: _____________________________________ State: _____ Zip: __________
Country: __________________________________
Tel: ______________________________________ Fax: ________________________
Email: ____________________________________
IAP Member: Yes ______ No ______
If Not an IAP Member : Conference fee is $695.00
Payment Method:
____ Check (Payable to Stanford University*)
____ Money Order
____ Credit card: _____ VISA _____ MC Number: _____________________
Expiration Date: _________________
Cardholder Name (as it appears on card): _______________________________
Signature: _____________________________
PLEASE FAX THIS FORM TO: (650) 723-0758
For other info please contact:
Michele King, CSLI, Ventura Hall, Stanford University, Stanford, CA
94305, Tel: 650-723-3084, Fax: 650-723-0758