Tuesday, June 24, 2014

#MongoDBWorld Learning about @LinkedIn Learning with @MongoDB

More from #MongoDBWorld.

LinkedIn uses MongoDB to quickly create APIs.

“What’s your next play?”

LinkedIn developed “LearnIn” “Transform yourself through Learning”

Find resources when you need them.

They want employees to be able to search across different platforms, e.g. Safari Books, training courses, the internal wiki, etc.

They also wanted to create a living, breathing transformation plan – used for personal development.

What technologies to use?

Primarily a JavaScript team.

What server and database to use?

How to build it quickly?

Server selection

They had no engineering support, so as a JavaScript team they set out to use Node.js.

  • Server-side JavaScript
  • Lightweight
  • NPM Package Support
  • Extensive support in the community
  • Allows for data-driven JSON

Which database?

The Needs:
- support 5,000 employees, globally
- Minimal data storage needs
- More reads than writes
- Scale with company growth
- Flexible

In the past year the company has grown from 3,000 to 6,000+ employees – hyper growth.

The database choice was MongoDB because:

  • Easy-to-set-up NoSQL with JSON documents
  • Node.js driver
  • Advanced queries
  • Flexible Schema – schema-less
  • Extensive documentation
  • High single instance threshold (single instance can hold Terabytes of data)

Taking a schema-less approach to MongoDB, you can break API/application logic.

They went with Mongoose, an Object Document Mapper (mongoosejs.com).

  • Easy type casting
  • Quick setup
  • Easy Document modeling
  • Field validation API
  • Business logic hooks, custom middleware
  • MongoDB _id reference population
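
To make the type casting, validation and hook items above more concrete, here is a small plain-JavaScript stand-in. It is not Mongoose's API and not the speakers' code; `courseSchema`, `castAndValidate` and `preSave` are hypothetical names used only to illustrate the idea of forcing documents into a known shape before they are stored.

```javascript
// Hypothetical schema for a learning resource: field name -> type and rules.
const courseSchema = {
  title: { type: String, required: true },
  durationMinutes: { type: Number, required: false },
  tags: { type: Array, required: false },
};

// Cast each declared field to its type and collect validation errors.
function castAndValidate(schema, input) {
  const doc = {};
  const errors = [];
  for (const [field, rule] of Object.entries(schema)) {
    const raw = input[field];
    if (raw === undefined || raw === null) {
      if (rule.required) errors.push(`${field} is required`);
      continue;
    }
    if (rule.type === Number) {
      const n = Number(raw);
      if (Number.isNaN(n)) errors.push(`${field} must be a number`);
      else doc[field] = n;
    } else if (rule.type === Array) {
      doc[field] = Array.isArray(raw) ? raw : [raw];
    } else {
      doc[field] = String(raw);
    }
  }
  // Fields not declared in the schema are dropped, so the API sees a stable shape.
  return { doc, errors };
}

// A hook that runs business logic just before the document would be saved.
function preSave(doc) {
  return { ...doc, title: doc.title.trim(), updatedAt: new Date() };
}

const result = castAndValidate(courseSchema, {
  title: "  Intro to Node.js ",
  durationMinutes: "45", // a string from a form, cast to a number
  tags: "javascript", // a single value, wrapped in an array
  surprise: true, // not in the schema, so it is dropped
});
const saved = result.errors.length === 0 ? preSave(result.doc) : null;
```

With a real ODM the schema declaration does this work for you; the point of the sketch is only that a stray or mistyped field never reaches the API layer.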

Mongoose forces 1 collection per data type.

This helped with:
- Normalized data modeling
- Many-to-many relationships
- Collection per Data Type
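
A rough illustration of what normalized, one-collection-per-type modeling with reference population looks like, written as plain JavaScript over in-memory Maps rather than against MongoDB or Mongoose. The collections (`resources`, `plans`), their fields and the helper names are all made up for this sketch.

```javascript
// One in-memory "collection" per data type, keyed by _id (hypothetical data).
const resources = new Map([
  ["r1", { _id: "r1", title: "Node.js basics" }],
  ["r2", { _id: "r2", title: "MongoDB schema design" }],
]);
const plans = new Map([
  ["p1", { _id: "p1", owner: "alice", resourceIds: ["r1", "r2"] }],
  ["p2", { _id: "p2", owner: "bob", resourceIds: ["r2"] }],
]);

// Replace the stored ids with the referenced documents; the stored plan is not modified.
function populate(plan, collection) {
  const populated = plan.resourceIds
    .map((id) => collection.get(id))
    .filter((doc) => doc !== undefined); // skip dangling references
  return { _id: plan._id, owner: plan.owner, resources: populated };
}

// The other direction of the many-to-many relationship: which plans use a resource?
function plansUsing(resourceId) {
  return [...plans.values()]
    .filter((plan) => plan.resourceIds.includes(resourceId))
    .map((plan) => plan._id);
}

const alicePlan = populate(plans.get("p1"), resources);
const r2Plans = plansUsing("r2");
```

Because each resource lives in exactly one place, renaming it once is enough for every plan that references it.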

How do we search?

How to search on documents via text search.

MongoDB provides full-text search (beta in v2.4)

  • Easy searching via a MongoDB index
  • Relevancy Searching
  • Stemming and multi-language support.
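
As a sketch of what index-based text search with relevancy looks like, the snippet below builds the documents you would hand to MongoDB. It assumes MongoDB 2.6 or later, where text search is exposed through the `$text` query operator (2.4 used a separate `text` command). The field names `title` and `description` and the weights are hypothetical, and nothing here talks to a database.

```javascript
// Index specification: which fields are text-indexed and how they are weighted.
// Field names and weights are hypothetical.
const textIndex = {
  keys: { title: "text", description: "text" },
  options: { weights: { title: 10, description: 2 }, default_language: "english" },
};

// Build the filter, projection and sort documents for a relevance-ranked text search.
function buildTextSearch(terms) {
  const score = { $meta: "textScore" };
  return {
    filter: { $text: { $search: terms } },
    projection: { score },
    sort: { score },
  };
}

const search = buildTextSearch("javascript training");
```

The `textScore` metadata is what provides the relevancy ordering: it is projected into each result and then used as the sort key.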

Some limitations:
- Single tokenizer/analyzer.
- Simple relevancy scoring but not Lucene
- No completion suggestion
- No fuzzy matching
- No related item search.

What was needed:
- Lucene index with relevancy scoring and performance
- Custom field analyzer for tokenization and stemming
- “Related to” querying or “More Like This”
- Quick completion suggestion
- Complicated wildcard search
- Easy Node.js integration

They went with Elasticsearch (elasticsearch.org).
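
To give a feel for the items on the wish list, here are request bodies for fuzzy matching, wildcard search, "More Like This" and completion suggestions, built as plain JavaScript objects. This is a sketch rather than what the team showed: the field names are hypothetical, and the parameter names follow the current Elasticsearch query DSL, which has changed between versions (older releases used `like_text` instead of `like`, for example), so check them against the version you run.

```javascript
// Each function returns a plain search request body; field names are hypothetical.

// Fuzzy matching: tolerate small typos in the search text.
function fuzzyQuery(text) {
  return { query: { match: { title: { query: text, fuzziness: "AUTO" } } } };
}

// Wildcard search on a single term.
function wildcardQuery(pattern) {
  return { query: { wildcard: { title: { value: pattern } } } };
}

// "More Like This": find documents similar to a piece of text.
function moreLikeThisQuery(text) {
  return {
    query: {
      more_like_this: {
        fields: ["title", "description"],
        like: text,
        min_term_freq: 1,
        min_doc_freq: 1,
      },
    },
  };
}

// Completion suggestion; assumes a field named "suggest" mapped as type "completion".
function completionSuggest(prefix) {
  return { suggest: { "title-suggest": { prefix, completion: { field: "suggest" } } } };
}

const requests = [
  fuzzyQuery("mongodb"),
  wildcardQuery("java*"),
  moreLikeThisQuery("schema design for document databases"),
  completionSuggest("nod"),
];
```

Each body would be sent as the body of a search request, for instance through the elasticsearch.js client.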

Getting ElasticSearch to work with MongoDB.

A river takes a stream of data and indexes it in real time.

The river plugs in to the MongoDB oplog and creates the Elasticsearch index using the same oplog that is used to create MongoDB replicas.

elasticsearch.js can plug in to the Node.js API.
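
The idea of following the oplog can be sketched in a few lines. This is a simplified, in-memory stand-in, not the river plugin itself: the "index" is just a Map, the entries are hypothetical, and only inserts (`i`), `$set` or full-replacement updates (`u`) and deletes (`d`) are handled.

```javascript
// Apply one oplog-style entry to the index so it stays in step with the database.
function applyOplogEntry(index, entry) {
  switch (entry.op) {
    case "i": // insert: index the new document
      index.set(entry.o._id, { ...entry.o });
      break;
    case "u": { // update: o2 identifies the document, o carries the change
      const id = entry.o2._id;
      const next = entry.o.$set
        ? { ...(index.get(id) || { _id: id }), ...entry.o.$set } // partial update
        : { _id: id, ...entry.o }; // full replacement
      index.set(id, next);
      break;
    }
    case "d": // delete: remove the document from the index
      index.delete(entry.o._id);
      break;
    default: // other entry types (no-ops, commands) are ignored in this sketch
      break;
  }
  return index;
}

// Hypothetical entries for a "learnin.resources" collection, in oplog order.
const oplog = [
  { op: "i", ns: "learnin.resources", o: { _id: 1, title: "Node basics" } },
  { op: "i", ns: "learnin.resources", o: { _id: 2, title: "Old wiki page" } },
  { op: "u", ns: "learnin.resources", o2: { _id: 1 }, o: { $set: { title: "Node.js basics" } } },
  { op: "d", ns: "learnin.resources", o: { _id: 2 } },
];

const searchIndex = oplog.reduce(applyOplogEntry, new Map());
```

Replaying the entries in order is the same mechanism a replica uses to catch up, which is why the search index can piggyback on it.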

The limitations of Elasticsearch:

  • Keeping the index clean requires data cleanup
  • Adds more technology to the stack


What IDE does the team work in?

  • WebStorm

Do you lose flexibility with Mongoose?

  • Not really. It does make you think about data structures but you can easily change them.

Jacob Dejno @dejno – Web Developer
Ryan Seamons @ryanseamons – Product Manager LearnIn


Mark Scrimshire
Health & Cloud Technology Consultant
