Use clucore #995

zupon · 2021-03-29T03:42:55Z

This PR addresses #991.

So far, this PR...

Adds CluCoreProcessor to EidosEnglishProcessor
Removes EidosCluProcessor
Removes CLU language from Language.scala and NegationHandler.scala

After these first changes, I ran sbt test, but most of the suites aborted with an error saying that DyNet needs to be initialized first:

 Attempting to define parameters before initializing DyNet. Be sure to call dynet::initialize() before defining your model.

There's also a good chance that I just plugged the CluCoreProcessor in the wrong place.

kwalcock · 2021-03-31T15:58:18Z

If the addition of org.clulab.dynet.Utils.initializeDyNet didn't help, then holler. Sorry it is buried so deeply. It's part of processors.

zupon · 2021-04-04T19:57:22Z

It does seem to work now. Thanks! I might not have placed the initialization call in the most ideal place, but we can move it around if needed.

I also reran sbt test in both this branch and master to compare which tests are failing.

Here's the end result of running sbt test in my local master branch:

[info] Run completed in 38 minutes, 12 seconds.
[info] Total number of tests run: 654
[info] Suites: completed 81, aborted 0
[info] Tests: succeeded 612, failed 42, canceled 7, ignored 254, pending 0
[info] *** 42 TESTS FAILED ***
[error] Failed tests:
[error]         org.clulab.wm.eidos.system.TestCrLf
[error]         org.clulab.wm.eidos.groundings.TestVersioner
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc3

And here's the same output in the use-clucore branch:

[info] Run completed in 24 minutes, 39 seconds.
[info] Total number of tests run: 654
[info] Suites: completed 81, aborted 0
[info] Tests: succeeded 498, failed 156, canceled 7, ignored 254, pending 0
[info] *** 156 TESTS FAILED ***
[error] Failed tests:
[error]         org.clulab.wm.eidos.text.english.raps.TestRaps
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc5
[error]         org.clulab.wm.eidos.serialization.jsonld.TestJLDSerializer
[error]         org.clulab.wm.eidos.text.english.raps.TestRaps1
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc8
[error]         org.clulab.wm.eidos.text.english.cag.TestCagP1
[error]         org.clulab.wm.eidos.text.english.cag.TestExtraText
[error]         org.clulab.wm.eidos.serialization.TestDocSerialization
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc2
[error]         org.clulab.wm.eidos.text.english.cag.TestCagP4
[error]         org.clulab.wm.eidos.system.TestCrLf
[error]         org.clulab.wm.eidos.groundings.TestVersioner
[error]         org.clulab.wm.eidos.serialization.jsonld.TestJLDDeserializer
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc3
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc6
[error]         org.clulab.wm.eidos.text.english.cag.TestCagP3
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc1
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc4
[error]         org.clulab.wm.eidos.system.TestEidosMention
[error]         org.clulab.wm.eidos.text.english.cag.TestCagP2
[error]         org.clulab.wm.eidos.utils.TestMentionUtils
[error]         org.clulab.wm.eidos.document.TestSentenceClassifier
[error]         org.clulab.wm.eidos.text.english.eval6.TestDoc7

At least for master, I'm pretty sure the TestCrLf failure is just because I have some local resources that are included in the test, but aren't actually pushed to remote. Not sure about TestVersioner or TestDoc3 though. I also don't know what the 7 cancelled tests are, but I assume (for now) that they are the same 7 in both branches, so no difference there.
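The comparison above can also be done mechanically. The following is a minimal sketch, not part of this PR, that assumes both sbt runs were saved as text and that failing suites appear as `[error]` lines after `Failed tests:`, as in the output quoted here. The object and method names (`SbtFailureDiff`, `failedSuites`, `newFailures`) are made up for illustration.

```scala
object SbtFailureDiff {
  // Matches lines such as "[error]         org.example.SomeSuite".
  private val FailedSuite = """^\[error\]\s+([\w.]+)\s*$""".r

  /** Collects the suite names listed after the "Failed tests:" line of an sbt log. */
  def failedSuites(log: String): Set[String] =
    log.linesIterator
      .dropWhile(line => !line.contains("Failed tests:"))
      .drop(1) // skip the "Failed tests:" header itself
      .collect { case FailedSuite(name) => name }
      .toSet

  /** Suites that fail on the branch but not on the baseline. */
  def newFailures(baselineLog: String, branchLog: String): Set[String] =
    failedSuites(branchLog) -- failedSuites(baselineLog)
}
```

Applied to the two lists quoted above, `newFailures` would leave out TestCrLf, TestVersioner, and TestDoc3, which fail in both runs, and report the remaining 20 suites that fail only in use-clucore.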

kwalcock · 2021-04-04T20:06:50Z

You don't need to worry about TestCrLf and TestVersioner and probably not TestDoc3. TestCrLf is probably because you are using Windows, TestVersioner is maybe because the git plugin can't version something locally (is git available on the command line?), and I think that TestDoc3 might be the unstable one. I'll definitely check before it becomes critical.

MihaiSurdeanu · 2021-04-04T20:42:25Z

Thanks! Can you please do an eidos output diff on the tests on the actual docs?

zupon · 2021-04-04T21:10:46Z

I'm not completely sure what you mean. Do you just mean to get the specific outputs of each failing test so we can see which parts are failing? E.g. for the TestRaps test, the output that includes this:

...
[info] Raps_sent8
[info] - should have correct node
[info] Raps_sent9
[info] - should have correct edge !!! IGNORED !!!
[info] Raps_sent10
[info] - should have correct edge 1 !!! IGNORED !!!
[info] - should have correct edge 2
[info] - should have correct edge 3 !!! IGNORED !!!
[info] Raps_sent11
[info] - should have correct node 2 *** FAILED ***
[info]   List("
[info]   Errors:
[info]          Could not find NodeSpec [Use of improved cultivars and mechanization|+POS(improved)]
...

MihaiSurdeanu · 2021-04-04T23:04:04Z

I'd like to see which relations are missing with CluCore. Then, the parse trees + NE labels with the old processors and CluCore for those sentences. I suspect this is something major, so a couple of relevant sentences should be sufficient to get us started. Thanks!

zupon · 2021-04-05T17:27:18Z

OK, I think I understand. The CluCoreProcessor uses the new CoreNLP relations (e.g. obj instead of dobj) for some things, correct? Changing to CluCoreProcessor is likely to break some tests that relied on the old relations. Now we want to find out which relations are missing, and compare the parse trees and SRLs for the sentences that differ. Does that sound accurate?

Is there a specific test or a particular document/text to check all the relations?
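As a rough illustration of the comparison being discussed, here is a minimal sketch that diffs two dependency parses of the same sentence. It does not use the processors API: the `Edge` case class and the `DependencyDiff` helpers are hypothetical, and it assumes both parses share the same tokenization and have at most one edge per head/dependent pair.

```scala
object DependencyDiff {
  /** One labeled dependency edge: head token index, dependent token index, relation label. */
  final case class Edge(head: Int, dependent: Int, label: String)

  /** Pairs of (old, new) edges that connect the same two tokens but carry different labels. */
  def relabeled(oldParse: Set[Edge], newParse: Set[Edge]): Set[(Edge, Edge)] = {
    val newByPair = newParse.map(e => (e.head, e.dependent) -> e).toMap
    oldParse.flatMap { oldEdge =>
      newByPair.get((oldEdge.head, oldEdge.dependent))
        .filter(_.label != oldEdge.label)
        .map(newEdge => (oldEdge, newEdge))
    }
  }

  /** Old edges with no edge at all between the same two tokens in the new parse. */
  def missing(oldParse: Set[Edge], newParse: Set[Edge]): Set[Edge] = {
    val newPairs = newParse.map(e => (e.head, e.dependent))
    oldParse.filterNot(e => newPairs.contains((e.head, e.dependent)))
  }
}

object DependencyDiffExample extends App {
  import DependencyDiff._

  // Made-up parses of "farmers plant maize" (token indices 0, 1, 2).
  val oldParse = Set(Edge(1, 0, "nsubj"), Edge(1, 2, "dobj"))
  val newParse = Set(Edge(1, 0, "nsubj"), Edge(1, 2, "obj"))

  relabeled(oldParse, newParse).foreach { case (o, n) =>
    println(s"${o.label} -> ${n.label} between tokens ${o.head} and ${o.dependent}")
  }
  println(s"missing edges: ${missing(oldParse, newParse).size}")
}
```

Separating relabeled edges from missing ones would help tell apart rules that only need a label update (dobj to obj) from sentences where the two parsers actually disagree on structure.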

BeckySharp · 2021-06-08T16:41:58Z

@zupon what's the status of this?

zupon · 2021-06-08T17:13:44Z

@BeckySharp There have been some other updates that haven't made it into this PR yet. Mihai adjusted some rules for expanding on contractions based on an error analysis I did. Now I am still in the process of looking at the failing tests when using CluCore and comparing those with the outputs from master using FastNLP.

There haven't been any more updates to this branch yet though, and I'm not sure when it'll be ready. If you want we can close this PR for now (without merging) until this is further along.

changes EidosEnglishProcessor to use ClueCoreProcessor, removes CLU language

e344fbc

zupon requested review from BeckySharp and kwalcock March 29, 2021 03:43

zupon added 2 commits April 4, 2021 12:05

initializes dynet for CluCoreProcessor

8f10403

changes to EidosEnglishProcessor in DebugGroundingExporter

fd3aa1a

merged master into use-clucore branch

c5e6f1a

fix some tests for CluCore

7d649a5
