Microsoft Should Fix These Nagging SharePoint Issues Before Pushing Ahead With Fancy New Features

We had an interesting Twitter conversation yesterday about the features we as SharePoint consultants want in the next version of SharePoint, with lots of discussion on the future of Yammer, Oslo, Power BI, etc.

However, there are a number of issues with SharePoint that, even as the platform matures, have never been addressed and in some cases are getting worse as we move to the cloud, integrate Yammer, etc.

This is my list of BASIC issues that consistently stump my customers:

  • Fix scalability so that SharePoint can truly handle massive volumes of documents (e.g. hundreds of millions).  The basics are there, but limitations on bulk updating, view limits, search limits, etc. start to bite without a lot of advanced configuration.
  • Version control and Content Types are not ADVANCED FEATURES!  Turn them on by default, move the settings up to the main menu and provide a more guided experience for both.
  • Remove the dependency on the URL as the unique identifier for a resource.  It’s simply too fragile as sites move around, links are made from external systems, etc.
  • Provide a built-in 404 location service that automatically and intelligently maps old links to new ones, old sites to new locations, etc.  Have a linking service that lets me bulk update URLs, remap items, and bulk move content around without breaking existing links.
  • Content Type and other Taxonomy concepts should be centrally governable.  For example, if I have design level permissions, I can take my company prescribed document library template and content types and simply remove the content types, add my own columns, etc.  I can easily break all the hard work the IM team has done to establish appropriate taxonomy.  Make this easier to publish out and govern.
  • Fix navigation so that it is global and integrates all site collections into a single navigation structure. 
  • Remove the distinction between “Publishing Site” and “Collaboration Site” – any site should allow for branding, creating pages, etc., and all the features in collaboration sites should work well with a branded experience.
  • Remove all the artificial default boundaries caused by Site Collections, such as content types not being global, navigation not spanning site collections, features being Site Collection scoped, etc.
  • Fix mass uploading and bulk updating of SharePoint document library items.

This isn’t an exhaustive list, but many of these issues have been around for multiple versions of SharePoint and should have been fixed in 2010 or 2013.  Before we launch a bunch of fancy new features, these are the basic ones that impact adoption of the platform with many of our clients.

Read More

SharePoint Teams Should Focus On The Basics

On Twitter, there are a number of very smart SharePoint consultants who spend a lot of time thinking about the latest innovations from Microsoft, how we can leverage the SharePoint platform in weird and wonderful ways, and how our customers could be super productive if they just used Yammer or Oslo, upgraded to SharePoint 2013, migrated to Office 365, branded their sites using JavaScript frameworks, etc.

All these are great ideas, but they miss a key challenge with many of our clients – we’re promoting features that have minimal value to them when they cannot even upload a document to the right spot or find anything.

When we talk to our customers and do workshops with them, we routinely find the #1 challenge is the simplest: “I CANNOT FIND ANYTHING”.  It isn’t “oh, I want to combine my YamJams and my co-authoring and my enterprise business intelligence with my Twitter analytics, and I need it all to look like our corporate brand”.  Instead, it is “WE ARE GOING TO GET SUED SOMEDAY AND I’M SCARED THE CONTRACT HAS GONE MISSING”.

AIIM did a survey and they published two statistics that I use continually in my discussions on SharePoint with clients:

[Images: AIIM survey statistics]

We have many clients who cannot succeed because they haven’t established the basics.  While Microsoft is actively selling its latest business intelligence features, Yammer integration, etc., we have clients who, after multiple versions of SharePoint, still cannot figure out how to set up a document library, build a workflow or publish a page.  Many of our clients want to do the right thing but haven’t been given the process tools, organizational strategies or information management strategies to answer the basic questions that SharePoint requires, like WHERE SHOULD I UPLOAD MY DOCUMENT?

Our Advisory Services team has a simple approach – invest more dollars in training, governance planning and information management strategy, and fewer in fancy features, branding and custom applications.  Align your SharePoint deployment to solving the really hard but basic information management problems, and ensure these are working effectively before investing in additional features.

Read More

Improving Performance of SQL Server Running in Azure VMs

As previously posted by my work colleague, Sergei Mizrokhi, there are some significant differences in performance when moving from a local SQL Server to an Azure VM:

As you can see from these tests, Azure VMs using either BLOB storage or attached virtual disk storage have significantly less throughput than a local implementation.

Microsoft has just released some new guidance on performance tuning for SQL Server running in a virtual machine on Azure:

Over the past few months we noticed some of our customers struggling with optimizing performance when running SQL Server in a Microsoft Azure Virtual Machine, specifically around the topic of I/O Performance.

We researched this problem further, did a bunch of testing, and discussed the topic at length among several of us in CSS, the SQL Server Product team, the Azure Customer Advisory Team (CAT), and the Azure Storage team.

You can find the performance checklist here.  Microsoft also has a document published called Performance Guidelines for SQL Server in Azure Virtual Machines that you might find helpful.

There is also more general documentation for running SQL Server on Azure here.

Read More

Microsoft Unveils its Office 365 Public Roadmap and First Release Program

One of the key challenges with Office 365 is keeping up with the numerous updates coming down from Microsoft.  As previously posted, there have been some negative experiences as new features break customizations, impact performance or require configuration changes to SharePoint.

Microsoft today announced its new Office 365 Public Roadmap.  The roadmap gives administrators visibility into new features that will be released in the future, so they can start preparing before those features are deployed to their Office 365 instance.

[Image: Office 365 Public Roadmap]

In addition, Office 365 administrators can obtain early access to releases.  This is highly useful for testing – some customers are creating separate Office 365 instances with a small group of test users simply to try new features before they reach the main production instance.

First Release group in Admin portal

Read More

Democratizing Big Data–Microsoft Unveils Machine Learning Cloud Service

Big Data is a huge trend in many different industries, and the use of Machine Learning to analyze and predict trends is one of its key applications.  The gap for many organizations is that generating meaningful analysis requires specialized experts such as data scientists, Hadoop experts and business intelligence specialists.

While cloud-based Hadoop has been available for some time, it has been presented as a raw service targeted at IT professionals and software engineers.  Harnessing machine learning approaches and algorithms has required a significant investment in both data science expertise, to choose the appropriate algorithms, and software engineering, to implement them.

Today, Microsoft introduced the new Microsoft Azure Machine Learning service, which will enable customers, partners and businesses to build data-driven applications.

Introducing Microsoft Azure Machine Learning

 

Azure ML, which previews next month, will bring together the capabilities of new analytics tools, powerful algorithms developed for Microsoft products like Xbox and Bing, and years of machine learning experience into one simple and easy-to-use cloud service. For customers, this means virtually none of the startup costs associated with authoring, developing and scaling machine learning solutions. Visual workflows and startup templates will make common machine learning tasks simple and easy. And the ability to publish APIs and Web services in minutes and collaborate with others will quickly turn analytic assets into enterprise-grade production cloud services.

Azure ML includes a design tool for harnessing enterprise grade algorithms for processing data and a software development kit for integrating them into custom developed applications. 

[Image: Azure ML Studio]

The product is currently in private preview and will be available as a public preview in a month.

Read More

Exploring The Evolving Real-Time Web Using SignalR

In the traditional web, a page is inherently NOT real time.  When you request a web page, your browser sends a request to the web server, which sends back the results.  Once your page is displayed, it’s fundamentally out of date until you hit the refresh button and reload the page.

As front-end technologies such as HTML5, JavaScript, jQuery, etc. become more sophisticated, web developers have started enabling a more real-time experience that supports sharing, collaborating and interacting in real time with both the server providing information and other clients.

A simple example is Twitter – if you leave your Twitter page up in your browser, you’ll periodically see messages like this in real time as new tweets arrive:

[Image: Twitter “new tweets” notification]

Why is the Real Time Experience an important evolution from traditional web metaphors? 

  • Discussion boards become chat rooms
  • Document libraries become co-authoring
  • Reports become interactive dashboards
  • Activity walls become real time activity feeds
  • Notifications after the fact become real time alerts
  • Web “pages” become applications
  • Pictures become real time animations

As we add multiple devices all working together to respond to events, having each one poll the server for updated information becomes a broken metaphor – we need to PUSH messages out to clients instead.  The server becomes a message broker rather than a message generator as clients send each other messages.

Microsoft has developed a framework for building real-time web applications called SignalR.  Version 2.0 of the framework has been out since the fall of 2013, and there is an excellent tutorial site here.  It allows continuous remote procedure calls between clients and servers using .NET and JavaScript.

Invoking methods with SignalR

The connection between client and server is persistent (i.e. kept open), allowing you as a developer to build applications that push messages to a set of connected clients.  The protocol used to push these messages varies by browser, and the framework automatically upgrades or downgrades the protocol based on what the browser can support.

The other key component in making the real-time web work is JavaScript frameworks – such as jQuery, jQuery UI, etc. – that can update the screen based on incoming information pushed from the server.

There are lots of JavaScript libraries that could be used as front end rendering layers for presenting data in real time.  Here are a few examples:

  • Heatmap.js: provides rendering of heatmaps based on incoming data.  Imagine a heatmap rendered in real time from a group of people’s clicks, mouse movements, or other signals.
  • Data-Driven Documents (D3.js): provides rendering of datasets – imagine these beautiful renderings being updated in real time based on data pushed from the server.
  • Cubism.js: provides rendering of time series in real time.
  • Raphaël.js: provides rendering of vector graphics using simple JavaScript.  This could be a very nice animation library for real-time, massively collaborative applications.
  • Paper.js: provides a full vector graphics scripting framework based on the HTML5 canvas.

Here are some good examples of real-time web applications that could be built using SignalR and other real-time web frameworks.

JabbR

  • A real-time chat application using SignalR as the framework.

[Image: JabbR]

Office Web Apps

The new Office Web Apps support co-authoring documents in real time.

Murally

  • Group collaboration using virtual sticky notes, images and activities in real time.

LucidChart

  • Group collaboration on the development of flowcharts

Example of Lucid Chart

Read More

Comparing Amazon and Microsoft Azure Hadoop Cluster Pricing

Hadoop is offered by both Amazon and Microsoft through their cloud services.  I spent some time today comparing pricing for a reasonably sized Hadoop cluster; here is the comparison in pricing per month.

Specifications

The specifications I used were:

  • 1 head node running an Extra Large (A4) instance on Azure, or the roughly equivalent c3.2xlarge on Amazon.  Both of these VMs have 8 cores and 14 GB RAM (Amazon’s has 15 GB RAM)
  • 10 data nodes running a Large (A3) instance on Azure, or the roughly equivalent c3.xlarge on Amazon.  Both of these VMs have 4 cores and 7 GB of RAM (Amazon’s has 7.5 GB RAM)
  • 50 TB of blob storage per month. I used Azure Local Redundant Storage as Amazon S3 storage standard is not geo redundant from what I can see.
  • 10 TB per month of inbound data transfers.
  • 5 TB per month of outbound data transfers.
  • 10 Million Transactions per month.  Each record you put into storage is considered a transaction.

The Results

Here are the results based on Azure and Amazon’s latest pricing (keep in mind these change quite often):

Price Per Day for VMs

 
Component | Quantity | Azure | Amazon
Head Node | 1 | $16.32 | $20.57
Data Node | 10 | $81.60 | $102.96

Price Per Month for Storage

Component | Quantity | Azure | Amazon
Storage | 50 TB | $1,244.16 | $1,510.40
Inbound Data Transfers | 10 TB | Free | Free
Outbound Data Transfers | 5 TB | $665.60 | $614.40
Transactions | 10 million | $0.60 | $50.00

 

In both scenarios, you can create yourself a 10-data-node Hadoop cluster for roughly $3,000 a month in VM costs.  Amazon charges more for its VMs, and with Hadoop in particular it adds a secondary charge on top of the VMs for its Elastic MapReduce service.  It also, curiously, charges significantly more for transactions (i.e. PUTs into storage).

Storage pricing depends on whether you choose locally redundant pricing or geo redundant pricing for Azure BLOB storage.  If you use geo redundant pricing, the storage cost goes from $1,244.16 per month to $2,488.32 per month for the same 50 TB of storage.
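To make the comparison concrete, here is a quick sketch of the monthly arithmetic using the figures from the tables above (assuming a 30-day month and locally redundant Azure storage; the Amazon EMR surcharge and any geo-redundancy uplift are excluded):

```javascript
// Rough monthly cost estimate from the per-day VM prices and per-month
// storage/transfer prices quoted above (assumes a 30-day month; real bills vary).
const DAYS = 30;

const azure = {
  vmPerDay: 16.32 + 81.60,   // head node + 10 data nodes, per day
  storage: 1244.16,          // 50 TB locally redundant, per month
  outbound: 665.60,          // 5 TB outbound transfers, per month
  transactions: 0.60,        // 10 million transactions, per month
};

const amazon = {
  vmPerDay: 20.57 + 102.96,
  storage: 1510.40,
  outbound: 614.40,
  transactions: 50.00,
};

function monthlyTotal(p) {
  return p.vmPerDay * DAYS + p.storage + p.outbound + p.transactions;
}

console.log('Azure VMs only :', (azure.vmPerDay * DAYS).toFixed(2));  // 2937.60
console.log('Azure total    :', monthlyTotal(azure).toFixed(2));      // 4847.96
console.log('Amazon total   :', monthlyTotal(amazon).toFixed(2));     // 5880.70
```

VM compute alone is where the ~$3,000-per-month figure comes from; at this scale, storage and outbound transfers roughly double the bill.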

Microsoft also provides a secondary head node free of charge to increase the availability of the service.

Amazon supports Hadoop 2.4 in production; Microsoft currently has it only in preview.  Both support Hadoop 2.2 in production today.

The Advantages of HDInsight in an Existing Microsoft Environment

HDInsight provides additional libraries that are designed to allow for better integration between Hadoop and other Microsoft technologies.  These include:

  • PowerShell scripts and cmdlets for automating Hadoop cluster deployments
  • The Avro library, which provides data serialization across languages for processing complex data structures from C, C++, C#, Java, PHP, etc.
  • Integration with Excel through Power Query
  • The HIVE ODBC driver for querying data from Windows, SQL Server, .NET, etc.

If you are already accustomed to Microsoft technologies and running in a Microsoft environment, these additional features provide an easier road to Hadoop as you integrate it with traditional SQL Server, Excel, SharePoint, etc.

Read More

Microsoft Now Supports Hadoop 2.4 in Preview

Microsoft now supports the latest version of Hadoop (version 2.4) through its HDInsight service in preview.  Hadoop version 2.2 is the default production release currently.

There are several new features in both the 2.2 and 2.4 releases that dramatically improve the use of Hadoop in general, as well as some Microsoft-specific libraries that improve performance dramatically over previous versions (Microsoft is claiming up to 100x!).

Key to these new versions of Hadoop is the YARN framework, which takes over resource management and job scheduling from classic MapReduce and opens the platform to new models for querying, searching and processing large volumes of data.

[Image: Hadoop 2.4 on HDInsight]

Read More