BIG DATA – IS – MDM – BI – FUSION TABLE

1 # BIG DATA       

The term big data refers to data sets of billions to trillions of records from various sources, structured and unstructured, that are beyond the ability of a typical DBMS to capture, store and analyse (Laudon, K.C. and Laudon, J.P., 2012; 2013).

The world is increasingly interconnected, instrumented and intelligent, and in this new world the growth in the volume, variety and velocity of data has created new challenges and opportunities. Big Data technologies can be defined as a new generation of technologies and architectures designed to extract value economically from very large volumes of a wide variety of data by enabling high-velocity capture, discovery and/or analysis.

Big Data encompasses hardware, software and services that integrate, organize, manage, analyze and present data characterized by the “three Vs” (volume, variety and velocity) together with a fourth, value: the benefits gained from Big Data projects, such as capital cost reduction, operational efficiency and business process improvement.

VOLUME

Volume is the primary attribute of big data: the huge flood of data that is generated every minute and every second. The quantity of data to be captured continues to grow exponentially. Businesses are investing heavily in big data because data mining tools are available that reveal interesting patterns and trends, with the potential to provide new insights into customer behavior, financial market activities, even water behavior.


This unprecedented quantity and quality of data is generated by mobile communications, social networks, machines and sensor devices, which continuously generate data streams without human intervention and increase the velocity of data aggregation and processing. Ralf Konrad (2012) and Philip Russom (2011) agree that variety is just as big as volume, and that variety and volume tend to fuel each other.

VARIETY

Variety is a critical attribute of big data. The data streams are uncontrolled and often unstructured, come from diverse channels and arrive in different formats, which is a key criterion in determining whether an application is considered to be big data. 85% of generated data is unstructured. McAfee, A. and Brynjolfsson, E. (2012) agree that “Big data takes the form of messages, updates, and images posted to social networks; readings from sensors; GPS signals from cell phones, and more”. The challenge is then to process this data in such a way that valuable information can be derived from it.

VELOCITY

Big data is also described by its velocity. Velocity refers to the speed at which data is captured, processed and produced, not just in bursts but in a continuous flow. As a result, many companies face the challenge of having to process huge amounts of data faster and faster, ideally in real time (Laudon, K.C. and Laudon, J.P., 2012; 2013). “Real-time or nearly real-time information makes it possible for a company to be much more agile than its competitors” (McAfee, A. and Brynjolfsson, E., 2012). According to Ralf Konrad (2015), 85% to 90% of all bits and bytes are unstructured, stem from various sources, must be up to date and have to be processed at high speed, so extracting relevant information is a key factor in turning the data into competitive advantage (http://www.t-systems.com). The velocity of data moving through a firm's systems varies from batch integration and loading of data at predetermined intervals to real-time streaming of data that is processed, for example by Hadoop, at that moment.

VALUE

“Value comes from knowing more than the rest”


Value is increasingly recognized as the fourth characteristic of big data. Because data streams are uncontrolled, unstructured and arrive from different channels in different formats, valuable information has to be extracted from the three Vs. Data mining is used to find patterns and trends, and processing data in real time with the right tool, such as Hadoop, makes it possible to produce valuable information from unstructured data efficiently and in a timely manner. The data is transported to a data warehouse and data marts to make it accessible, stored securely so that it is guaranteed to be available when needed, and then refined to recognize patterns, trends, meaning and correlation using analytical tools such as the chi-squared test. Value means guaranteed added value and new opportunities thanks to intelligent analysis: new data is transformed into information, which in turn is combined with business know-how to secure a valid basis for decision-making and to deliver value to customers. Implementation is a means to an end: best practice is reached when the firm becomes a responsive entity that operates smartly and achieves the six operational objectives. In short, VALUE = COMPETITIVE ADVANTAGE over competitors.
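The chi-squared test mentioned above can be illustrated with a small calculation. This is only a sketch in plain Python: the function name `chi_squared` and the counts in the table are hypothetical, and it computes just the test statistic for a contingency table, not the p-value.

```python
def chi_squared(table):
    """Return the chi-squared statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic

# Hypothetical counts: rows are two customer segments,
# columns are "bought" and "did not buy".
observed = [[30, 10], [20, 40]]
statistic = chi_squared(observed)
```

A larger statistic means the observed counts are further from what independence would predict, which is the kind of correlation the refining step is looking for.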

Analyzing big data in motion

With certain kinds of data (for example sensor data or fraud detection data), there is no time to store it before acting on it, because it is constantly changing. The key to evaluating the velocity requirements of Big Data is to understand the business and organizational processes and the requirements of end users. High-velocity, high-volume data also calls for in-motion analytics.
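To make the idea of in-motion analytics more concrete, here is a minimal sketch in Python. It assumes a made-up stream of sensor readings and a hypothetical rule (flag a reading that is more than twice the mean of the previous five); the function name `flag_spikes` and the thresholds are illustrative only.

```python
from collections import deque
from statistics import mean

def flag_spikes(readings, window=5, factor=2.0):
    """Yield (index, value) for readings above `factor` times the mean
    of the previous `window` readings.

    Only the last `window` values are kept in memory, so each reading is
    analysed as it arrives instead of being stored first.
    """
    recent = deque(maxlen=window)
    for index, value in enumerate(readings):
        if len(recent) == window and value > factor * mean(recent):
            yield index, value
        recent.append(value)

# Hypothetical sensor stream; an iterator stands in for a live feed.
sensor_stream = iter([10, 11, 9, 10, 10, 35, 10, 11])
alerts = list(flag_spikes(sensor_stream))
```

The same loop would work on an endless feed, because nothing beyond the small window is ever retained.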

https://www.sap.com/bin/sapcom/en_gb/downloadasset.2012-09-sep-26-13.idc-report–big-data-trends-strategies-and-sap-technology-pdf.bypassReg.html
http://public.dhe.ibm.com/common/ssi/ecm/im/en/imm14124usen/IMM14124USEN.PDF

2 #   MASTER DATA MANAGEMENT (MDM)

MDM deals with synchronized enterprise-wide business data that provides definitions and identifiers of the internal and external objects involved in business transactions (e.g., customer, product, reporting unit, market share). http://www.information-management.com/channels/master-data-management.html

MDM is defined as a transformative effort that requires firms to restructure their human resources, business policies, internal processes and management mindsets (Laudon, K.C. and Laudon, J.P., 2012; 2013). It is also “a strategic business driver as it enables organizations to unify and consolidate data about their customers, products and organizations; data that is often fragmented across different systems”. http://www.evancarmichael.com/Small-Business-Consulting/1650/The-Importance-of-Master-Data-Management-MDM.html

Master data (MD)

Master data is data that is shared across systems and used to classify transactional data. Without data integrity, transaction data cannot be analyzed or reported in a meaningful way (John Kopcke, 2008, http://www.oracle.com/us/products/middleware/bus-int/064333.pdf).
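A small sketch may help show what "classifying transactional data" with master data means in practice. Everything here is hypothetical: the customer IDs, names, segments and the function `revenue_by_segment` are invented for illustration, and plain Python dictionaries stand in for real systems.

```python
# Hypothetical master data: one trusted record per customer ID.
customer_master = {
    "C001": {"name": "Acme Ltd", "segment": "Retail"},
    "C002": {"name": "Birch & Co", "segment": "Wholesale"},
}

# Hypothetical transactions that carry only the customer ID.
transactions = [
    {"customer_id": "C001", "amount": 120.0},
    {"customer_id": "C002", "amount": 300.0},
    {"customer_id": "C001", "amount": 80.0},
    {"customer_id": "C999", "amount": 50.0},  # no matching master record
]

def revenue_by_segment(transactions, master):
    """Total transaction amounts per segment, using master data to classify."""
    totals = {}
    unmatched = []
    for tx in transactions:
        record = master.get(tx["customer_id"])
        if record is None:
            # Without a master record the transaction cannot be classified.
            unmatched.append(tx)
            continue
        segment = record["segment"]
        totals[segment] = totals.get(segment, 0.0) + tx["amount"]
    return totals, unmatched

totals, unmatched = revenue_by_segment(transactions, customer_master)
```

The unmatched transaction shows the integrity point in miniature: a sale that cannot be tied to a master record drops out of any meaningful report.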

MDM enhances big data

MDM creates context for big data by providing trusted information about how incoming unstructured data fits into the business environment. Big data in turn creates context for MDM by providing new insights from social media and other sources, which helps companies build richer customer profiles. http://public.dhe.ibm.com/common/ssi/ecm/im/en/imm14124usen/IMM14124USEN.PDF


MDM relates to data governance (DG), a formal set of practices or processes that ensures that important data assets are formally managed throughout the enterprise, that the data can be trusted and that people can be held accountable for any adverse event caused by low data quality. DG needs clear “top down” buy-in and structure, and significant “bottom up” willingness to take on duties. http://www.sas.com/content/dam/SAS/en_us/doc/whitepaper1/sas-data-governance-framework–

Data governance guiding principles


Business users need accurate, clean, timely data about their prospects, customers, competition, etc. to meet business objectives and goals. http://www.b-eye-network.com/blogs/oneal/assets_c/2012/11/DG-530.php

Once the data governance role is part of people's jobs, they are more likely to make better decisions about the role of data and how it applies to the corporate mission.

Data Stewardship, Data Security Roles & Tasks


Data stewards are the keepers of the data throughout the organization. They serve as the points of contact for data definitions, usability, questions and access requests. Their tasks include creating clear and unambiguous definitions of data; defining a range of acceptable values, such as data type and length; enforcing the policies set by a data governance council or any other oversight board; monitoring data quality and starting root cause investigations when problems arise; participating in the definition and revision of data policy; understanding the usage of data in the business units; and reporting metrics and issues to the data governance council.
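The steward's task of defining acceptable values (data type, length, range) can be sketched as a set of simple rules. This is an illustrative Python example only: the field names, the limits in `RULES` and the `validate` function are assumptions, not part of any governance tool.

```python
# Hypothetical rules a data steward might define for two fields.
RULES = {
    "county": {"type": str, "max_length": 20},
    "population": {"type": int, "min": 0, "max": 10_000_000},
}

def validate(record, rules=RULES):
    """Return a list of rule violations for one record (empty if clean)."""
    problems = []
    for field, rule in rules.items():
        if field not in record:
            problems.append(f"{field}: missing")
            continue
        value = record[field]
        if not isinstance(value, rule["type"]):
            problems.append(f"{field}: expected {rule['type'].__name__}")
            continue
        if "max_length" in rule and len(value) > rule["max_length"]:
            problems.append(f"{field}: longer than {rule['max_length']} characters")
        if "min" in rule and value < rule["min"]:
            problems.append(f"{field}: below {rule['min']}")
        if "max" in rule and value > rule["max"]:
            problems.append(f"{field}: above {rule['max']}")
    return problems

clean = validate({"county": "Dublin", "population": 1_270_000})
dirty = validate({"county": "Dublin", "population": -5})
```

Violations collected this way are the kind of metric a steward could report to the data governance council.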

Data quality


Source: https://www.google.ie/search?q=information+quality+dimension&newwindow

Data quality means data that is trusted and “fit for the purpose”: data that falls within the information quality dimensions.

3 # Information System (IS)

An information system is composed of people and computers that collect, process, store and distribute information to support decision making and control in an organization, with a view to achieving the business's short-term and long-term goals.

Importance of information system

“What the business would like to do in five years often depends on what IS will be able to do” (Laudon, K.C. and Laudon, J.P., 2013). A business's responsiveness in a fast-moving business environment depends heavily on the quality of its information systems. IS also enables business firms to achieve their strategic objectives.


See Laudon, K.C. and Laudon, J.P. (2013: 42-44) to read more.

Further, when IS is at the heart of a business firm, it enables the firm to tighten linkages with suppliers and develop customer intimacy, as well as to identify and provide solutions to challenges from the business environment. The source below illustrates how IS engages with the firm internally and externally.


Source: http://www.prenhall.com/behindthebook/0132304619/pdf/Laudon%20Feature%203.pdf

INFORMATION SYSTEMS AND CAREERS

Knowledge of information systems has become a must, as they are used in every career there is.

4 # BUSINESS INTELLIGENCE (BI)

Business Intelligence and decision making

Business intelligence is a contemporary term for the data and software tools used to organize, analyse and provide access to data, to help managers at all levels and other users make more informed decisions (Laudon, K.C. and Laudon, J.P., 2012; 2013). MIS, DSS and ESS deliver information and knowledge to the organisation's members so that they can better perform their duties.

“High quality decisions require high quality information”


Source: https://www.google.ie/search?q=business+intelligence+and+analytics+systems+for+decision+support


High-velocity automated decision making

The class of decisions that are highly structured and automated is growing at a fast pace, as decisions are not necessarily made by human managers but by machines. For example, when a query is entered into the Google search engine, Google decides which URLs to display in less than a second. Computer algorithms enable automated high-speed decision making.

5 #  GOOGLE FUSION TABLE

A heat map depicting the 2011 census results for the Irish population

1 # Fusion table and heat map


Google Fusion Tables is a cloud-based service for data management and integration. It enables users to upload tabular data files (spreadsheets, CSV, KML) and supports the rendering of heat maps (Gonzalez, H., Halevy, A., Jensen, C.S., Langen, A. et al., http://homes.cs.washington.edu).

 

2 # Heat map creation

The heat map uses different colours to display the Irish population of the 26 counties in 2011. The data was compiled in an Excel table and saved in CSV format so that Fusion Tables could recognize the geocode data, which allowed it to be imported into Fusion Tables. A KML file containing the geographic data was uploaded from Google Drive and then merged with the Excel table. To learn more about how to compile a data table in Excel and how to create a fusion table and merge the two, see https://www.youtube.com/watch?v=0SLyS4-zGeo and https://www.youtube.com/watch?v=p0xnk9zFQpY
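The merge step can be pictured as a join on the county name. The sketch below is plain Python rather than Fusion Tables itself, and it rests on assumptions: the population figures are rounded placeholders rather than census values, the `boundaries` dictionary stands in for the geometry held in the KML file, and `merge_on_county` is a made-up helper.

```python
import csv
import io

# Placeholder CSV content (rounded, illustrative figures, not census data).
population_csv = io.StringIO(
    "county,population\n"
    "Dublin,1270000\n"
    "Cork,519000\n"
    "Leitrim,32000\n"
)

# Stand-in for the KML file: county name -> boundary geometry.
boundaries = {
    "Dublin": "<Polygon>...</Polygon>",
    "Cork": "<Polygon>...</Polygon>",
    "Leitrim": "<Polygon>...</Polygon>",
}

def merge_on_county(csv_file, boundaries):
    """Join each CSV row to its boundary using the county name as the key."""
    merged = []
    for row in csv.DictReader(csv_file):
        name = row["county"].strip()
        if name in boundaries:
            merged.append({
                "county": name,
                "population": int(row["population"]),
                "geometry": boundaries[name],
            })
    return merged

rows = merge_on_county(population_csv, boundaries)
```

The join only works when the county names match exactly in both files, which is why the table has to be compiled carefully before upload.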


 

 

 

 

3 # Heat map information

The heat map visually displays the population of each county. Dublin is shown in red and has the highest population, over 1 million; 11 counties have a population below 100,000, a further 11 counties have a population below 200,000, and Cork and Galway have populations over 200,000. This data can serve many purposes in making strategic decisions. For example, HSE decision makers could use the heat map, together with the legend and a pie chart showing each county's proportion of the population, to decide on the closure of hospitals in less populated counties, or to plan disease control in case of an outbreak. Politicians can use the heat map to target their campaigns at the most populated counties and win over the majority of voters. The Department of Education can use the heat map to decide where to build schools. Supermarket business development managers can use the heat map to analyse the feasibility of opening a supermarket in a town based on population density.
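The population bands behind the colours can be written down as a simple rule. This is a hedged sketch: the thresholds come from the description above, but the function name `population_band`, the band labels and the handling of values exactly on a boundary are assumptions.

```python
def population_band(population):
    """Return the heat-map band for a county population."""
    if population > 1_000_000:
        return "over 1 million"      # Dublin, shown in red on the map
    if population > 200_000:
        return "over 200,000"
    if population >= 100_000:
        return "100,000 to 200,000"
    return "under 100,000"

# Rounded, illustrative figures rather than exact census values.
bands = {name: population_band(count) for name, count in
         [("Dublin", 1_270_000), ("Cork", 519_000), ("Leitrim", 32_000)]}
```

A decision maker could count the counties in each band to see at a glance where services are most and least concentrated.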

4 # Importance of Google Fusion table

Data visualization is one of the most powerful features of Fusion Tables: users can visualize their data immediately after uploading it, and the service can render massive geographic data sets. Fusion Tables also enables users to visualize uploaded data in many different ways. A heat map is a method of point data visualization that shows the density of points in a given area (http://www.sco.wisc.edu). Because Fusion Tables supports the production of heat maps in line with the density of features in space, users can easily extract the information that decision makers need. Fusion Tables allows the integration of data from many sources by executing joins. In short, Fusion Tables gives data owners a tool to upload data safely, where all users can share the same data visually and interactively without email traffic. Google Fusion Tables helps users embed visualizations, reveal trends, and interpret information and stories. Users can easily create and publish compelling visualizations on a website. Google Fusion Tables thus makes it possible to analyse a geographical situation and tailor solutions to any given scenario.

BUSINESS INTELLIGENCE

 

# 1 Business Intelligence (BI)

“BI is a collection of decision support technologies for the enterprise aimed at enabling knowledge workers such as executives, managers, and analysts to make better and faster decisions”. “It is the activity of gathering information about elements in the environment interacting with the firm”, state Laudon, K.C. and Laudon, J.P. (2012, 2013), who add that intelligence consists of discovering, identifying and understanding the problems occurring in the organisation: why a problem exists, where it occurs and what effect it is having on the firm's operations.

The business world is moving at a fast pace, and firms must have the insight and data they need in order to make the right calls at the right time. BI is a key factor in success because it is key to making those correct decisions. BI joins data, technology, analytics and knowledge to help business professionals make the optimal decisions that drive their enterprise's success. BI includes an enterprise data warehouse and a BI platform or tool set to help executives transform the data into actionable information. http://www.ngdata.com/

# 2 BI infrastructure

 

Data is pieced together from the organisation's different systems (internal data) as well as from the business environment (external data). A contemporary BI infrastructure therefore has an array of tools for obtaining useful information from all the different data types (structured and unstructured) and sources, by managing and analysing non-trading data.

BI capabilities combine tools such as data warehouses, data marts, Hadoop, in-memory computing and analytical platforms, which offer tools such as OLAP, data mining, text mining and web mining to retrieve information about trends, patterns and relationships from unstructured data.

# 3 BI optimizes performance and keeps costs low

“BI can have a direct positive impact on a company’s business performance, dramatically improving its ability to accomplish its mission by making smarter decisions at every level of the business from corporate strategy to operational processes.” http://timoelliott.com(2009)

OLAP (online analytical processing) is query-oriented and represents relationships within data. It enables users to view the same data in different ways using a multidimensional data model, so that they can get answers to their queries rapidly online.
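Viewing the same data along different dimensions can be sketched with a tiny fact table. This is plain Python, not an OLAP product: the sales records, the dimension names and the `roll_up` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sales facts with three dimensions and one measure.
sales = [
    {"region": "East", "product": "Laptop", "quarter": "Q1", "units": 5},
    {"region": "East", "product": "Phone", "quarter": "Q1", "units": 8},
    {"region": "West", "product": "Laptop", "quarter": "Q1", "units": 3},
    {"region": "East", "product": "Laptop", "quarter": "Q2", "units": 7},
]

def roll_up(facts, dimensions, measure="units"):
    """Sum the measure for each combination of the chosen dimensions."""
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[d] for d in dimensions)
        totals[key] += fact[measure]
    return dict(totals)

# The same facts, viewed two different ways.
by_region = roll_up(sales, ["region"])
by_product_quarter = roll_up(sales, ["product", "quarter"])
```

Changing the list of dimensions is the equivalent of asking a different question of the same cube, without touching the underlying data.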

Data mining is discovery-driven and gives insights into a firm's data. It is used to find hidden patterns and relationships in large databases and to infer rules from them to predict future occurrences. Data mining provides information on associations, sequences, classifications and clusters.
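Associations are often described with two measures, support and confidence. The following Python sketch computes both for a handful of made-up shopping baskets; the basket contents and function names are illustrative assumptions, and real data mining tools search for such rules automatically rather than testing one by hand.

```python
# Hypothetical shopping baskets, one set of items per transaction.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in the itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Of the baskets containing the antecedent, the fraction that also
    contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)

bread_support = support({"bread"}, baskets)
bread_to_milk = confidence({"bread"}, {"milk"}, baskets)
```

Here the rule "bread implies milk" holds in two of the three baskets that contain bread, which is the kind of inferred rule the paragraph above refers to.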

With text mining, sentiment analysis software is used to measure customers' sentiment about the company. Attensity analysis software is also used to analyse customer interactions and identify the voices used in expressing feedback.
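A very rough idea of how sentiment can be measured is to count positive and negative words. The sketch below is a toy in Python: the word lists and the `sentiment_score` function are invented for illustration and are far simpler than what commercial sentiment software does.

```python
# Hypothetical word lists; real tools use much larger lexicons and models.
POSITIVE = {"great", "good", "love", "helpful"}
NEGATIVE = {"bad", "slow", "poor", "rude"}

def sentiment_score(text):
    """Positive word count minus negative word count for one comment."""
    words = [word.strip(".,!?").lower() for word in text.split()]
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives

happy = sentiment_score("Great service, very helpful staff!")
unhappy = sentiment_score("Slow delivery and rude support.")
```

A score above zero suggests a favourable comment and a score below zero an unfavourable one, which can then be averaged across all customer feedback.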

Web mining provides insight into customers' behaviour and helps evaluate the effectiveness of a website. Tools include Google Trends and Google Insights, which track the popularity of different keywords and key phrases used in Google searches. Web content mining can be used to extract knowledge from content such as text, images, audio and even video data.

# 4 Conclusion

It is advisable to rationalize the number of BI tools in for-profit and non-profit organisations, as this results in cost savings, more control over information, better alignment with business goals and increased competitive advantage from fully exploiting the benefits of enterprise BI.

 

References

Laudon, K.C. and Laudon, J.P. (2012, 2013) Management Information Systems: Managing the Digital Firm, twelfth and thirteenth editions, Pearson.

Chaudhuri, S., Dayal, U. and Narasayya, V. BI technologies are essential to running today's businesses and this technology is going through sea changes. doi:10.1145/1978542.1978562

http://timoelliott.com/blog/2009/07/implementing-business-intelligence-standards-save-money-and-improve-business-insight.html

 

Using R programming language

1 # What is R language

The R programming language is helping all industries to grow.
“R programming language to become the single most important tool for computational statistics, visualization and data science”

“R is a programming language and a collection of many inbuilt statistical and mathematical functions”. It is claimed to be a language that can handle huge amounts of data quickly, as it has connectivity and compatibility with almost all types of databases and programming languages. Norman, N. (2015) added that “R is the most powerful and flexible statistical programming language in the world”. It handles vectors, matrices and lists. R is claimed to enable “computing on the language, which in turn makes it possible to write functions that take expressions as input, something that is often useful for statistical modelling and graphics”. http://cran.r-project.org/doc/manuals/r-release/R-lang.pdf (2015)

2 # Importance of using R

“R is hot and getting hotter”. R can read data from almost any kind of file, be it an Excel sheet, an image, a text file, a CSV file or the web.

Matrix plotting in R includes powerful visualizations for matrix data. For example, elevation above sea level can be mapped so that higher matrix values stand out visually, because “R makes it easier to draw meaning from multidimensional data with multi-panel charts, 3-D surfaces and more”. http://www.revolutionanalytics.com/what-r

3 # The power of R & Matrix visualization

The example is a heat map of a dormant volcano, revealing that the volcano is active and presents threats to ecology, people and profits. Ecologists, geologists, aviation authorities and property developers can gain substantial information from it.

Ecologists and geologists can gain information for further R&D work, including drawing up an environmental protection plan. Decision makers would be aware of the danger and could halt commercial activities around the area. Aviation authorities can plan airspace routes ahead.

R is also used for summary statistics that show the distribution of data points: it uses functions such as mean, median and standard deviation, and displays data on graphs. Many users and writers say that “R includes virtually every data manipulation, statistical model, and chart that the modern data scientist could ever need” (http://www.revolutionanalytics.com/what-r).

Factors hold categorical variables in R; they help divide data into groups and keep track of the categorized values. To learn more, see http://www.stat.berkeley.edu/~s133/factors.html

R can also merge data: the merge function joins data frames together using the content of their columns, and it is possible to import data into the R console.

Vectors, matrices and lists are handy when data is needed in rows and columns: “Features that make R really unique and useful is its data structures”. To learn more about importing a master Excel file into the R console for simplified data manipulation, see https://www.youtube.com/watch?v=LjuXiBjxryQ
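The summary statistics and grouping described above can be sketched in a few lines. The example is written in Python rather than R, to keep one language across the examples on this blog; in R the corresponding functions would be `mean()`, `median()` and `sd()`, with a factor doing the grouping. The values and group labels are made up.

```python
from collections import defaultdict
from statistics import mean, median, stdev

# Made-up measurements and a categorical label for each one.
values = [2, 4, 4, 4, 5, 5, 7, 9]
groups = ["a", "a", "b", "b", "a", "b", "a", "b"]

# stdev() is the sample standard deviation, the same definition R's sd() uses.
summary = {"mean": mean(values), "median": median(values), "sd": stdev(values)}

def mean_by_group(values, groups):
    """Split the values by category label and average each group."""
    buckets = defaultdict(list)
    for value, label in zip(values, groups):
        buckets[label].append(value)
    return {label: mean(items) for label, items in buckets.items()}

group_means = mean_by_group(values, groups)
```

Grouping by a label in this way mirrors what a factor does for categorical data: the same numbers are summarised once overall and once per category.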

References 

Machlis, S. (2013) http://www.computerworld.com/
Smith, D. (2015) http://www.revolutionanalytics.com
Norman, N. (2015) http://www.revolutionanalytics.com
http://cran.r-project.org (2015)
http://www.stat.berkeley.edu


 

 

 
