Not your parent, but I've recently toyed with TiddlyWiki, which shows promise, but requires heavy configuration.
There is also an extension to it called TiddlyMap which displays several of the properties mentioned in this article (edges with properties, etc.), but again, it requires configuration to get it just so.
If you're game to do some tinkering, I've found it to be hackable to some very deep levels. Another nicety is that it's all just a single HTML file, so it's madly portable (I can use the same site on my phone and laptop).
All this being said, there is a growing list of features I'd like to see in TiddlyWiki that I'm not sure I can hack in myself, so I suppose I, too, am looking for the "one true knowledge management" solution.
Just start with Apache Jena: it's standards-based, with RDF as the exchange format and SPARQL as the query language. Other solutions may use proprietary formats for better vendor lock-in; whether you want that is entirely up to you. With Apache Jena you can switch to other knowledge-graph databases later. Also, Apache Jena is easy to work with, since it ships with Fuseki, which lets you expose your data as a web API right away.
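To give a feel for how little setup Jena needs, here's a minimal sketch (the namespace, resources, and query are made-up examples, not from any tutorial) that builds an in-memory RDF model and queries it with SPARQL:

    import org.apache.jena.query.*;
    import org.apache.jena.rdf.model.*;

    public class JenaSketch {
        public static void main(String[] args) {
            // In-memory RDF model; Jena can also persist to TDB on disk
            Model model = ModelFactory.createDefaultModel();
            String ns = "http://example.org/";  // made-up namespace
            Property knows = model.createProperty(ns, "knows");
            model.createResource(ns + "alice")
                 .addProperty(knows, model.createResource(ns + "bob"));

            // SPARQL, the standard query language mentioned above
            String q = "SELECT ?s ?o WHERE { ?s <http://example.org/knows> ?o }";
            try (QueryExecution qe = QueryExecutionFactory.create(q, model)) {
                ResultSet rs = qe.execSelect();
                while (rs.hasNext()) {
                    QuerySolution row = rs.next();
                    System.out.println(row.get("s") + " knows " + row.get("o"));
                }
            }
        }
    }

Load the same data into Fuseki and you get that SPARQL endpoint over HTTP without writing any server code.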
I've been working on building a personal knowledge database tool recently, feel free to shoot me an email at antimatter15@gmail.com if you'd like to be one of the first to try it out.
I found your product this week when searching for a neo4j visualization tool, but I couldn't try it on anything other than an example database. Is there any way to try/use it as a researcher?
Hi, I'm a co-founder of Gephi and Linkurious. I've found visualizing large graphs pretty useless beyond the "I see meatballs!" effect, and my opinion, after a decade in the field, is that it's the wrong problem for data analytics.
Much more interesting information is discovered while dynamically building a visualization focused on the user's questions. With Linkurious, I see that investigators usually need to visualize fewer than 1,000 edges of a graph with over a million edges to get their answers.
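Concretely, that means querying a bounded neighborhood instead of loading the whole graph into the client. A minimal sketch with the Neo4j Java driver (assuming driver 4.x; the connection details, property name, and 500-row cap are made up for illustration):

    import org.neo4j.driver.*;

    public class FocusedSubgraph {
        public static void main(String[] args) {
            // Hypothetical connection details
            try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                     AuthTokens.basic("neo4j", "secret"));
                 Session session = driver.session()) {
                // Fetch only the immediate neighborhood of one node of
                // interest, capped far below the ~1,000-edge budget above
                Result result = session.run(
                    "MATCH (a {id: $id})-[r]-(b) RETURN a, r, b LIMIT 500",
                    Values.parameters("id", "suspect-42"));
                while (result.hasNext()) {
                    System.out.println(result.next().asMap());
                }
            }
        }
    }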
The ultimate answer is generally a small graph: Graphistry is a tool that helps you get there. The reason that's hard is that most Splunk, Spark, etc. queries return a bunch of events, each with a bunch of metadata. A tool should help, not fall over.
I think you're referring to scenarios closer to why we created the visual playbook concept and our embedding APIs. Small visualizations are often a good starting point in investigative scenarios. Even better: no visualization, just full automation. We find this thinking comes up when the investigative flow is more established and curated. With visual playbooks, teams can record & automate multistep flows, run them whenever an incident happens, take action, and share & document the results. If part of the incident involves a bunch of events, or the analyst wants to dig in, our stack won't fall over. Instead, it provides a full visual analytics session with multiple cross-linked data views.
And we're fans of Gephi. We GPU-accelerated the core algorithm; we may just be coming from a different perspective and user base.
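To give a sense of what gets accelerated: force-directed layouts spend most of their time in an all-pairs repulsion loop, which is independent per node and therefore a natural GPU target. Here's a toy single-threaded sketch of one iteration (simplified physics and made-up constants, not Gephi's or Graphistry's actual code):

    public class ForceStep {
        // One iteration of a toy force-directed layout. The O(n^2)
        // repulsion loop is the hot spot a GPU parallelizes per node.
        static void step(double[] x, double[] y, int[][] edges, double dt) {
            int n = x.length;
            double[] fx = new double[n], fy = new double[n];
            for (int i = 0; i < n; i++) {          // repulsion: every pair
                for (int j = 0; j < n; j++) {
                    if (i == j) continue;
                    double dx = x[i] - x[j], dy = y[i] - y[j];
                    double d2 = dx * dx + dy * dy + 1e-9;
                    fx[i] += dx / d2;
                    fy[i] += dy / d2;
                }
            }
            for (int[] e : edges) {                // attraction: along edges
                double dx = x[e[1]] - x[e[0]], dy = y[e[1]] - y[e[0]];
                fx[e[0]] += 0.05 * dx; fy[e[0]] += 0.05 * dy;
                fx[e[1]] -= 0.05 * dx; fy[e[1]] -= 0.05 * dy;
            }
            for (int i = 0; i < n; i++) {
                x[i] += dt * fx[i];
                y[i] += dt * fy[i];
            }
        }

        public static void main(String[] args) {
            double[] x = {0, 1, 2}, y = {0, 1, 0};
            int[][] edges = {{0, 1}, {1, 2}};
            step(x, y, edges, 0.1);
            System.out.println(java.util.Arrays.toString(x));
        }
    }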
The conclusion is misleading because of two wrong assumptions:
1. The population is heterogeneous: interviews test different skills. Not all interviews test the same set of skills, yet that is a prerequisite for comparing interview scores, because scores are aggregates of these skill tests. Different job opportunities mean different skills to test, so it seems reasonable to assume that people's evaluations vary across job opportunities, and thus that their scores vary across interviews.
2. The observations are not statistically independent: past interviews may influence future interviews. People may get better at passing or conducting interviews over time, which would affect their scores. It would be good to study the evolution of individual scores over time.
While (1) should strongly limit the conclusions of the study, the complete analysis may simply be irrelevant because of (2) if the statistical independence of the observations is not demonstrated. Sorry, guys, but this is Statistics 101.
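For instance, a quick way to probe (2): order each person's scores chronologically and compute the lag-1 autocorrelation; values far from zero are evidence against independence. A minimal sketch (the score sequence below is made up):

    import java.util.Arrays;

    public class IndependenceCheck {
        // Lag-1 autocorrelation of one person's chronological scores.
        // Values far from 0 suggest the observations are not independent.
        static double lag1Autocorrelation(double[] s) {
            double mean = Arrays.stream(s).average().orElse(0);
            double num = 0, den = 0;
            for (int i = 0; i < s.length; i++) {
                den += (s[i] - mean) * (s[i] - mean);
                if (i + 1 < s.length) {
                    num += (s[i] - mean) * (s[i + 1] - mean);
                }
            }
            return den == 0 ? 0 : num / den;
        }

        public static void main(String[] args) {
            double[] scores = {2.1, 2.4, 2.3, 2.9, 3.2, 3.1};  // made up
            System.out.printf("lag-1 autocorrelation: %.3f%n",
                    lag1Autocorrelation(scores));
        }
    }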
(1) We listened to most interviews on the platform to establish homogeneity. Interviews were, across the board, language-agnostic and primarily algorithmic in nature.
(2) We actually looked into this and found that time didn't really affect performance. Usually, people did their interviews over a pretty short time span and then found a job, or they were already experienced interviewers and had more or less hit a plateau. You can see the raw data and how it oscillates with respect to time in the footnotes.
Technical side: we recommend displaying up to 2,000 nodes and edges. Laptops less than two years old can display and lay out graphs of up to 4,000 nodes and edges, but with stability issues.
Cognitive side: we recommend hiding nodes and edges as soon as you no longer need them. One cannot ask the same class of questions of graph visualizations of very different sizes; see slide 19 at http://www.slideshare.net/Cloud/sp1-exploratory-network-anal...
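To illustrate what hiding can look like in code, here's a minimal sketch (the degree heuristic and the cap are illustrative choices, not Linkurious internals) that trims a graph to its k highest-degree nodes before rendering:

    import java.util.*;
    import java.util.stream.Collectors;

    public class PreRenderFilter {
        // Keep only the k highest-degree nodes so the client stays under
        // the recommended element budget; degree stands in for whatever
        // relevance measure the current question calls for.
        static List<String> topKByDegree(Map<String, List<String>> adj, int k) {
            return adj.entrySet().stream()
                    .sorted(Comparator.comparingInt(
                            (Map.Entry<String, List<String>> e) ->
                                    e.getValue().size()).reversed())
                    .limit(k)
                    .map(Map.Entry::getKey)
                    .collect(Collectors.toList());
        }

        public static void main(String[] args) {
            Map<String, List<String>> g = new HashMap<>();
            g.put("hub", List.of("a", "b", "c"));
            g.put("a", List.of("hub"));
            g.put("b", List.of("hub"));
            g.put("c", List.of("hub"));
            System.out.println(topKByDegree(g, 2)); // [hub, <a degree-1 node>]
        }
    }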
The biggest dataset used by one of our users is a genetic graph of 240 million nodes and edges on a single server. Linkurious takes a few hours to index the complete dataset. From then on, the search engine delivers instant results with autocomplete, fuzzy matching, and advanced query options; graph exploration queries take less than a second to complete (sometimes a bit more, depending on the web client). We are still working on improving our data indexing strategy to gain performance.
Synerscope has a strong approach to data analysis, and Danny Holten is well known in the infovis community. I don't think they provide a search engine, though; you probably have more information on their product than I do.
Disclaimer: Linkurious CEO here; Linkurious is the tool used to explore the Neo4j graph database at NASA.