September 19: temp

Session 1: Web Search (Room A)
A Web Search Method Based on the Temporal Relation of Query Keywords
  Tomoyo Kage, Kazutoshi Sumiya
Meta-Search Based Web Resource Discovery for Object-level Vertical Search
  Ling Lin, Gang Li, Lizhu Zhou
PreCN: Preprocessing Candidate Networks for Efficient Keyword Search over Databases
  Jun Zhang, Zhaohui Peng, Shan Wang, Huijing Nie
Searching Coordinate Terms with their Context from the Web
  Hiroaki Ohshima, Satoshi Oyama, Katsumi Tanaka

Session 2: Web Retrieval (Room B)
A Semantic Matching of Information Segments for Tolerating Error Chinese Words
  Maoyuan Zhang, Chunyan Zou, Zhengding Lu, Zhigang Wang
Block-Based Similarity Search on the Web Using Manifold-Ranking
  Xiaojun Wan, Jianwu Yang, Jianguo Xiao
Design and Implementation of Preference-based Search
  Paolo Viappiani, Boi Faltings
Topic-based Website Feature Analysis for Enterprise Search from the Web
  Baoli Dong, Huimei Liu, Zhaoyong Hou, Xizhe Liu

Session 3: Web Workflows (Room C)
Fault-tolerant Orchestration of Transactional Web Services
  An Liu, Liusheng Huang, Qing Li, Mingjun Xiao
Supporting Effective Operation of E-Governmental Services through Workflow and Knowledge Management
  Dong Yang, Lixin Tong, Yan Ye, Hongwei Wu
DOPA: A Data-driven and Ontology-based method for Ad hoc Process Awareness in Web Information Systems
  Meimei Li, Hongyan Li, Lv-an Tang, Baojun Qiu
A Transaction-Aware Coordination Protocol for Web Services Composition
  Wei Xu, Wenqing Cheng, Wei Liu

10:15-10:45, Coffee break
10:45-12:30, Concurrent Sessions: Session 4, Session 5, Session 6

Session 4: Web Services (Room A)
Unstoppable Stateful PHP Web Services
  German Shegalov, Gerhard Weikum, Klaus Berberich
Quantified Matchmaking of Heterogeneous Services
  Michael Pantazoglou, Aphrodite Tsalgatidou, George Athanasopoulos
Pattern Based Property Specification and Verification for Service Composition
  Jian Yu, Tan Pham Manh, Jun Han, Yan Jin, Yanbo Han, Jianwu Wang
Detecting the Web Services Feature Interactions
  Jianyin Zhang, Fangchun Yang, Sen Su

Session 5: Web Mining (Room B)
Exploiting Rating Behaviors for Effective Collaborative Filtering
  Dingyi Han, Yong Yu, Guirong Xue
Exploiting link analysis with a three-layer web structure model
  Qiang Wang, Yan Liu, JunYong Luo
Concept Hierarchy Construction by Combining Spectral Clustering and Subsumption Estimation
  Jing Chen, Qing Li
Automatic Hierarchical Classification of Structured Deep Web Databases
  Weifeng Su, Jiying Wang, Frederic Lochovsky

Session 6: Performant Web Systems (Room C)
A Robust Web-based Approach for Broadcasting Downward Messages in a Large-scaled Company
  Chih-Chin Liang, Chia-Hung Wang, Hsing Luh, Ping-Yu Hsu
Buffer-preposed QoS Adaptation Framework and Load Shedding Techniques over Streams
  Rui Zhou, Guoren Wang, Donghong Han, Pizhen Gong, Chuan Xiao, Hongru Li
Cardinality Computing: A New Step Towards Fully Representing Multi-Sets by Bloom Filters
  Jiakui Zhao, Dongqing Yang, Lijun Chen, Jun Gao, Tengjiao Wang
An Efficient Scheme to Completely Avoid Re-labeling in XML Updates
  Hye-Kyeong Ko, SangKeun Lee

Session 1B: Information organization and retrieval (Room B)
Blogs in American Academic Libraries: An Overview of their Present Status and Possible Future Use
  Zhuo Fu
Rebuilding the library OPAC
  Zhigeng Wang
Web Content Mining for Market Intelligence Acquiring from B2C Websites
  Danxiang Ai, Yufeng Zhang, Hui Zuo, Quan Wang
Design of Chinese Word Segmentation System Based on Improved Chinese Converse Dictionary and Reverse Maximum Matching Algorithm
  Liyi ZHANG, Yazi LI, Jian MENG
Cross-media Database Retrieval System Based on TOTEM
  Cheng ZENG, Haiyang Zhou, Bing Yan

Session C: Advances in Web-based Learning (Room C)
Collaborative User Tracking for Community Organization on Blogosphere: a Case Study of eLearning@BlogGrid
  Jason J. Jung, Inay Ha, Supratip Ghose, Geun-Sik Jo
Adaptive UI Storytelling System using MOL
  Sunghan Bae, Rohae Myung
Construction of a Distributed Learning Resource Management System Based on RSS Technology
  Chengling Zhao, Liyong Wan, Ying Yu, Qi Luo
A Semantic Web Application to Automate The Construction of Didactic Material for Web-based Education System
  Ruben Peredo, Leandro Balladares and Ivan Peredo

Session 8: Web Document Analysis (Room B)
A Latent Image Semantic Indexing Scheme For Image Retrieval On The Web
  Xiaoyan Li, Lidan Shou, Gang Chen, Lujiang Ou
Hybrid Method for Automated News Content Extraction from the Web
  Yu Li, Xiaofeng Meng, Qing Li, Liping Wang
A Hybrid Sentence Ordering Strategy in Multi-document Summarization
  Yanxiang He, Dexi Liu, Hua Yang, Donghong Ji, Chong Teng, Wenqing Qi
Document Fragmentation for XML Streams based on Query Statistics
  Huan Huo, Guoren Wang, Xiaoyun Hui, Chuan Xiao, Rui Zhou
A Heuristic Approach for Topical Information Extraction from News Pages
  Yan Liu, Qiang Wang, Qingxian Wang

September 13: TOEP-5

Sometimes, I am lucky. :)
not bad
Wang Shu

Thank you for participating in the University of California, Irvine, Test of Oral English Proficiency (TOEP). We hope that it was a valuable learning experience.

The panel of three members of the UCI academic community (one undergraduate student, one ESL expert, and one faculty member) has conferred and decided on the following recommendation:

5 - Pass. Congratulations. Your test showed that you have good mastery of spoken English. You are eligible to be a TA at UCI. Workshops at LARC and IRC can help you improve your teaching skills.

Please note that scores on the TOEP are not negotiable. Your score has been reported to RGS; that office will forward the score to your department. Please contact your department if you need any other specific information on your eligibility to be a TA.

A letter with your official score and a sheet with specific feedback on the language features of your test will be mailed to your campus zotcode address. We hope they will be useful to you.

September 12: Things to do

The boss is coming back any day now and I haven't done a damn thing. Feels damn great.
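So what should I do next?
1. Understand DH's Active Rule and write the proposal
2. Continue Semantic Office and write the detailed design
3. Bring over the design and code of SD RegE2 and use them in Semantic Office.
Aim to wrap this up with everyone within 2 weeks
4. Integrate the following code
A. SD RegE2's text processing (Page Parser, Sentence Splitter)
B. SD RegE2's DB Accessing module
C. Semantic Computer's SO Accessing module
D. JLinkGrammar's JNI module, including that modified DLL
Then run an experiment on the results extracted with WebZip
5. Nutch & Lucene: grant me the aura of an archangel
Last time I gave up halfway. Anyone curious can take a look at this oddball, Doug Cutting (http://nutch.sourceforge.net/blog/cutting.html)
Every time I see this guy who looks like an alien, along with his terse self-introduction, I can't help feeling awed.
My goal is not just to know about this open-source search engine; in fact, I already do.
More importantly, I want to understand her at the level of her soul. To put it a little poetically:
She is a princess raised by the devil
Do not dream of conquering her
For even to touch her soul
You must carry the aura of an archangel
6. Prepare a research plan. I think this should come after I finish reading that classic paper by Sergey Brin and Lawrence Page
7. Read books
<千万别把我当人> (Please Don't Call Me Human)
<From a wiseguy to a wiseman>
<Call of the wild>
That's basically it. I won't show off the other little tricks here. I'll end with the words of the masters: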
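Because humans can only type or speak a finite amount, and as computers continue improving, text indexing will scale even better than it does now. Of course there could be an infinite amount of machine generated content, but just indexing huge amounts of human generated content seems tremendously useful. So we are optimistic that our centralized web search engine architecture will improve in its ability to cover the pertinent text information over time and that there is a bright future for search.

---------------------------------------------------------------------------------------------------

To this day, I still haven't figured out whether that really happened, or whether it was just a dream of mine.

September 8: Officially playing with Nutch & Lucene

There is already plenty of reference material online, but I still ran into a few problems when configuring things myself. Relatively speaking, though, the nutch crawler is fairly foolproof. :)

1. Install cygwin.
I first downloaded cygwin from the web and hit some problems during installation: the ls command was not supported, and the system reported a missing dll file. I right-clicked "My Computer" to edit the environment variables and added the C:\cygwin\bin directory to the path environment variable, which makes the dll files cygwin needs at runtime available to the system. Then I started cygwin, ok!

2. Test nutch.
The guides online already explain this clearly, heh. In cygwin, run

    cd /cygdrive/d/backup/nutch-0.7/bin

and then, in the bin directory, run the nutch command

    sh nutch

If everything is fine, it prints a big pile of command usage. I haven't run into the failure case yet, so I can't say what that looks like.

3. Run the crawler.
The guides online say to create the urls file in the nutch root directory. That did not seem to work with nutch0.7.1; maybe some other configuration is involved. I put the urls file (no file extension) in the nutch bin directory, and only then could the crawler find it. The file name can be anything, as long as the crawl command is given the correct name. In urls, add

    http://www.nutchchina.com

as the root URL of the crawl. In nutch/conf/crawl-urlfilter.txt, set the crawl rule

    +^http://([a-z0-9]*\.)*nutchchina.com/

After configuring these, cd to the nutch bin directory in cygwin and run the following command:

    sh nutch crawl urls -dir crawl.nutchchina.com -depth 3 -threads 4

urls is the file that holds the root URLs of the crawl.
crawl.nutchchina.com is the directory we specify for storing the index files; some temporary files are also stored there while the crawler runs.
-depth 3 sets the crawl depth.
-threads 4 starts 4 threads for crawling.

Wait... a pile of log output. Because www.nutchchina.com does not have much content, it was over quickly, in less than a minute. At this point the index has been built.

4. Configure tomcat for searching.
Yesterday a user named wjjj said that tomcat could not run nutch. I don't know the specific cause in his case; below is my configuration process, which worked on the first try.
Copy nutch-0.7.war from the nutch directory into tomcat's webapps directory and unpack it. Rename tomcat's original ROOT directory, for example to ROOT.bak, then rename the nutch-0.7 directory to ROOT. Then go to the D:\Tomcat4\webapps\ROOT\WEB-INF\classes directory and edit nutch-site.xml:

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="/nutch-conf.xsl"?>
    <!-- Put site-specific property overrides in this file. -->
    <nutch-conf>
    <property>
    <name>searcher.dir</name>
    <value>D:\backup\nutch-0.7\bin\crawl.nutchchina.com</value>
    </property>
    </nutch-conf>

This sets the index directory to the index path the crawler just built, crawl.nutchchina.com. Start tomcat and visit http://localhost:8080; you should see the search page.

I am not very satisfied with the search results, perhaps because the site consists almost entirely of dynamic pages, with no static pages. Many search results just redirect to the home page. A lot of the material in the forum was not crawled either, and I don't know why.

A few days ago I tried running some tests on top of nutch's base classes to test the fetch function: give it a url, get back the page information for that url. It did not work; a lot more needs to be configured :(. If you don't use the crawler nutch provides and instead write your own crawler on top of nutch's base classes, it looks like quite a bit of source code has to be modified. Heh, I'll try again later.

September 3: Portishead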
Portishead (IPA: /pɔːtɪsˈhɛd/) is a trip hop band from Bristol, England, named after the small town of Portishead, 12 miles west of Bristol. With their use of live jazz samples and intentionally lo-fi sound, the band has been cited as influential by many modern musicians including underground producer Danger Mouse [1] and videogame musician/producer Akira Yamaoka.
History

The band was formed in 1991, by keyboardist/multi-instrumentalist Geoff Barrow and singer Beth Gibbons. Barrow had previously worked with two other trip hop bands from Bristol, Massive Attack and Tricky, and decided to name his new endeavor after his hometown. After releasing a short film (To Kill a Dead Man) and its accompanying music, Portishead signed a record deal with Go! Beat Records and their first album, Dummy, was released in 1994, and featured heavy contributions from guitarist Adrian Utley. In spite of the band's aversion to press coverage, the album was successful in both Europe and the United States, spawning two hit singles, "Glory Box" and "Sour Times". Portishead has often been used as accompanying music in the media. Such examples include car adverts, Channel 4 intermissions and the teenage drama series Sugar Rush.

Their second album, Portishead, was released in 1997, and featured the single "All Mine". A live album featuring new orchestral arrangements of the group's songs was recorded primarily at Roseland in New York City, and released in 1998 with a DVD of the concert soon following. 1999 saw a cooperation with singer Tom Jones for a track on his album Reload.

There were rumours of a third album to be published, possibly called Alien, but Portishead's official site dismissed the rumours: "We have noticed that there is some confusion on an album release called "Alien". Please be aware that this is NOT a Portishead release. The band are in the studio working on new material now but no release dates are scheduled as yet. Keep an eye on the site as any release plans will of course be announced here first!" [2]

As of August 2006, no new album has been released. However in the summer of 2006 a new Portishead track, entitled "Requiem for Anna" (from the tribute album to Serge Gainsbourg entitled Monsieur Gainsbourg Revisited), began to circulate in the music blogosphere. In addition to this the band revealed to NME that they are struggling with their third album but are continuing to work hard on it.

In February 2005 the band appeared live for the first time in seven years at the Tsunami Benefit Concert in Bristol. Around the same time Barrow revealed that the band was in the process of writing its third album, although nothing has been produced as yet. In January 2006 3D from Massive Attack confirmed that the two bands planned another joint concert later in the year.

In 2005, Utley and Barrow produced The Coral's The Invisible Invasion. Barrow along with Utley, Clive Deamer and John Baggott also assisted with the production of Stephanie McKay's "McKay" album in 2003 under the Go! Beat Records label.

Portishead and other so-called trip hop groups have expressed dislike for the term, arguing it is a media invention designed to categorise their otherwise not-so-categorise-able music.

After rumours which had circulated for over a year, Portishead showed the first signs that their third album was progressing by posting two new tracks on their myspace page in August 2006. However these were subsequently dismissed by Geoff Barrow as 'doodles'[3].