dimsum16.github.io/index.html at master · dimsum16/dimsum16.github.io · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
<!DOCTYPE html>
<html lang="en-us">
  <head>
    <meta charset="UTF-8">
    <title>SemEval 2016 Task 10: Detecting Minimal Semantic Units and their Meanings (DiMSUM) by dimsum16</title>
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <link rel="stylesheet" type="text/css" href="stylesheets/normalize.css" media="screen">
    <link href='https://fonts.googleapis.com/css?family=Open+Sans:400,700' rel='stylesheet' type='text/css'>
    <link rel="stylesheet" type="text/css" href="stylesheets/stylesheet.css" media="screen">
    <link rel="stylesheet" type="text/css" href="stylesheets/github-light.css" media="screen">
  </head>
  <body>
    <section class="page-header">
      <h1 class="project-name">SemEval 2016 Task 10: Detecting Minimal Semantic Units and their Meanings (DiMSUM)</h1>
      <h2 class="project-tagline">Task Home Page</h2>
    </section>

    <section class="main-content">
      <p>The <strong>DiMSUM</strong> shared task at <a href="http://alt.qcri.org/semeval2016/">SemEval 2016</a> is concerned with predicting, given an English sentence, a broad-coverage representation of lexical semantics. The representation consists of two closely connected facets: a segmentation into <strong>minimal semantic units</strong>, and a labeling of some of those units with semantic classes known as <strong>supersenses</strong>.</p>

<p>For example, given the POS-tagged sentence</p>

<blockquote>
<p>I<sub><code>PRON</code></sub>  googled<sub><code>VERB</code></sub> restaurants<sub><code>NOUN</code></sub> in<sub><code>ADP</code></sub> the<sub><code>DET</code></sub> area<sub><code>NOUN</code></sub> and<sub><code>CONJ</code></sub> Fuji<sub><code>PROPN</code></sub> Sushi<sub><code>PROPN</code></sub> came<sub><code>VERB</code></sub> up<sub><code>ADV</code></sub> and<sub><code>CONJ</code></sub> reviews<sub><code>NOUN</code></sub> were<sub><code>VERB</code></sub> great<sub><code>ADJ</code></sub> so<sub><code>ADV</code></sub> I<sub><code>PRON</code></sub> made<sub><code>VERB</code></sub> a<sub><code>DET</code></sub> carry<sub><code>VERB</code></sub> out<sub><code>ADP</code></sub> order<sub><code>NOUN</code></sub></p>
</blockquote>

<p>the goal is to predict the representation</p>

<blockquote>
<p>I  googled<sub><code>v.communication</code></sub> restaurants<sub><code>GROUP</code></sub> in the  area<sub><code>n.location</code></sub> and  Fuji<code>_</code>Sushi<sub><code>n.group</code></sub> came<code>_</code>up<sub><code>v.communication</code></sub> and reviews<sub><code>n.communication</code></sub> were<sub><code>v.stative</code></sub> great so I  made<code>_</code> a  carry<code>_</code>out<sub><code>v.possession</code></sub> <code>_</code>order<sub><code>v.communication</code></sub></p>
</blockquote>

<p>Noun supersenses start with <code>n.</code>, verb supersenses with <code>v.</code>, and  <code>_</code> joins tokens within a multiword expression. (<em>carry</em><code>_</code><em>out</em><sub><code>v.possession</code></sub> and <em>made</em><code>_</code><em>order</em><sub><code>v.communication</code></sub> are separate MWEs.)</p>

<p>The two facets of the representation are discussed in greater detail below. Systems are expected to produce the both facets, though the manner in which they do this (e.g., pipeline vs. joint model) is up to you.</p>

<p>Gold standard training data labeled with the combined representation is provided in two domains: <strong>online reviews</strong> and <strong>tweets</strong>. (Rules for using other data resources in <a href="#data-conditions">data conditions</a>.) Blind test data will be in these two domains as well as a third, surprise domain. The domain of each sentence will not be indicated as part of the input at test time. The three test domains will have equal weight in the overall system scores (see <a href="#scoring">scoring procedure</a>).</p>

<h2>
<a id="minimal-semantic-units" class="anchor" href="#minimal-semantic-units" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Minimal semantic units</h2>

<p>The word tokens of the sentence are partitioned into basic units of lexical meaning. Equivalently, where multiple tokens function together as an idiomatic whole, they are grouped together into a <strong>multiword expression</strong> (MWE). MWEs include: nominal compounds like <em>hot dog</em>; verbal expressions like <em>do away with</em> 'eliminate', <em>make decisions</em> 'decide', <em>kick the bucket</em> 'die'; PP idioms like <em>at all</em> and <em>on the spot</em> 'without planning'; multiword prepositions/connectives like <em>in front of</em> and <em>due to</em>; multiword named entities; and many other kinds.</p>

<ul>
<li>Input word tokens are never subdivided.</li>
<li>Grouped tokens do not have to be contiguous; e.g., verb-particle constructions are annotated whether they are contiguous (<em>make up</em> the story) or gappy (<em>make</em> the story <em>up</em>). There are, however, formal constraints on gaps to facilitate sequence tagging.</li>
<li>Combinations considered to be statistical collocations (yet compositional in meaning) are called "weak MWEs", distinguished from MWEs with idiomatic meanings ("strong MWEs"). Otherwise, different categories of MWEs are not explicitly annotated.</li>
</ul>

<p>Refer to <a href="http://www.cs.cmu.edu/%7Enschneid/mwecorpus.pdf">this LREC 2014 paper</a> for details of the MWE annotation.</p>

<h2>
<a id="semantic-classes" class="anchor" href="#semantic-classes" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Semantic classes</h2>

<p>These are broad-coverage lexical categories known as "supersenses".</p>

<ul>
<li>There are 26 noun supersenses, including <code>n.person</code>, <code>n.location</code>, <code>n.time</code>, <code>n.food</code>, and <code>n.communication</code>. They cover common nouns as well as proper names (named entities).</li>
<li>There are 15 verb supersenses, including <code>v.motion</code>, <code>v.social</code>, and <code>v.communication</code>.</li>
<li>Supersense annotations always respect strong MWE annotations: the supersense class applies to the entire MWE as a unit. Of MWEs, all and only the ones that holistically function as a noun or verb expression are labeled with a supersense.</li>
<li>Single-word noun and verb tokens also receive supersenses. E.g., instances of <em>hot dog</em> and <em>hamburger</em> would both receive the <code>n.food</code> label.</li>
</ul>

<p>Refer to <a href="http://www.cs.cmu.edu/%7Enschneid/sst.pdf">this NAACL 2015 paper</a> for details of the annotation of supersenses on top of MWEs.</p>

<h2>
<a id="data-conditions" class="anchor" href="#data-conditions" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Data conditions</h2>

<p>There will be three conditions according to which systems will be compared.</p>

<p>To facilitate a controlled comparison of algorithms, in the <strong>(semi-)supervised closed</strong> conditions, systems may only use specific data resources. </p>

<ul>
<li>In the <strong>supervised closed</strong> condition, the following are permitted:

<ul>
<li>the labeled training data</li>
<li>the <a href="http://wordnet.princeton.edu/">English WordNet</a> lexicon</li>
<li>the following sets of word clusters (Brown clusters):

<ul>
<li>yelpac-c1000-m25.gz from the <a href="http://www.cs.cmu.edu/%7Eark/LexSem/">English Multiword Expression Lexicons</a>—this clustering was <a href="http://www.cs.cmu.edu/%7Eark/LexSem/mwelex/README.md">induced</a> from the Yelp Academic Dataset; and/or</li>
<li>any of the <a href="http://www.cs.cmu.edu/%7Eark/TweetNLP/#resources">ARK Tweet NLP clusters</a>
</li>
</ul>
</li>
</ul>
</li>
<li>In the <strong>semi-supervised closed</strong> condition: all of the above are permitted, plus the <a href="https://www.yelp.com/academic_dataset">Yelp Academic Dataset</a>.</li>
<li>In the <strong>open</strong> condition, systems may use any and all available resources.</li>
</ul>

<p>System submissions will specify which of these datasets were used, and this will determine which competition(s) it is entered into. Each team is allowed to submit up to 3 systems—one in the supervised closed condition, one in the semi-supervised closed condition, and one in the open condition.</p>

<p>A new test set has been annotated for this task. A blind version (input only) has been released in advance of the evaluation period.</p>

<h2>
<a id="scoring" class="anchor" href="#scoring" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Scoring</h2>

<p>Our evaluation script, <a href="https://github.com/dimsum16/dimsum-data/blob/master/scripts/dimsumeval.py">dimsumeval.py</a>, is bundled with the latest data release. Primarily, it reports F-scores for MWE identification, supersense labeling, and their combination. See the documentation in the script for an overview. Further details of the scoring procedure will be announced at a future time.</p>

<h2>
<a id="downloads" class="anchor" href="#downloads" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Downloads</h2>

<ul>
<li>
<strong><a href="https://github.com/dimsum16/dimsum-data/releases/tag/1.5">Training/test data + scripts v1.5</a></strong>

<ul>
<li><a href="https://github.com/dimsum16/dimsum-data/blob/1.5/README.md">README</a></li>
<li><a href="https://github.com/dimsum16/dimsum-data/blob/1.5/TAGSET.md">TAGSET</a></li>
</ul>
</li>
<li>
<strong>Trial data</strong>: Download STREUSLE 2.0 <a href="http://www.ark.cs.cmu.edu/LexSem/">here</a>. This consists of annotated online reviews (it will eventually form part of the training set for the task).

<ul>
<li>Refer to the files streusle.tags and streusle.tags.sst (which contain equivalent information, but in different formats). The formats are described in README.md.</li>
</ul>
</li>
<li>A <strong>baseline system</strong> will be provided as well.</li>
</ul>

<h2>
<a id="system-submission" class="anchor" href="#system-submission" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>System submission</h2>

<p>The deadline for system outputs to be submitted is <strong>Jan. 31, 2016</strong>. Instructions:</p>

<ol>
<li>Create a .zip file containing:

<ul>
<li>
<strong>dimsum16.test.pred</strong>, your system's output. Each line should correspond to a line in dimsum16.test.blind, with values filled into the appropriate columns. You should run the evaluation script (with a dummy gold standard) to ensure your system output is formatted correctly.</li>
<li>
<strong><a href="https://raw.githubusercontent.com/dimsum16/dimsum-data/master/submission.csv">submission.csv</a></strong>, but filled out with a specification of the resources used for the submission.</li>
</ul>
</li>
<li>Create a <a href="https://www.softconf.com/naacl2016/SemEval2016/user/">new START submission</a> (requires an account). Provide the team name and members, select Task 10, and write a short blurb about your system. Upload the zip file.</li>
<li>After creating your submission, you may update it (both metadata and zip file) up until the deadline. Only the last file upload for that submission will be retained.</li>
<li>Each team may enter up to 3 submissions, one per data condition. Specify the same team name for all of them, and ensure the submission.csv is different for the different conditions. If multiple submissions from the same team fall under the same data condition, the last one will be used for the evaluation.</li>
</ol>

<p>We plan to make all submissions—including the test set predictions—public at some point subsequent to the task. (If there are institutional barriers to this, please contact the organizers.)</p>

<h2>
<a id="organization" class="anchor" href="#organization" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Organization</h2>

<p>Please subscribe to <a href="https://groups.google.com/group/dimsum16">https://groups.google.com/group/dimsum16</a> for announcements about the task.</p>

<p>See the <a href="http://alt.qcri.org/semeval2016/task10/index.php?id=important-dates">schedule</a> for planned data releases and deadlines.</p>

<p>The organizers are:</p>

<ul>
<li>
<a href="http://nathan.cl">Nathan Schneider</a>, University of Edinburgh</li>
<li>
<a href="http://dirkhovy.com/">Dirk Hovy</a>, University of Copenhagen</li>
<li>
<a href="http://www.johannsen.com/">Anders Johannsen</a>, University of Copenhagen</li>
<li>
<a href="http://marinecarpuat.weebly.com/">Marine Carpuat</a>, University of Maryland</li>
</ul>

      <footer class="site-footer">

        <span class="site-footer-credits">This page was generated by <a href="https://pages.github.com">GitHub Pages</a> using the <a href="https://github.com/jasonlong/cayman-theme">Cayman theme</a> by <a href="https://twitter.com/jasonlong">Jason Long</a>.</span>
      </footer>

    </section>


  </body>
</html>