Showing posts with label analytics. Show all posts

Friday, April 26, 2024

Snowflake Optimization on top of any Table

There are several optimization techniques you can implement on a table to reduce storage and compute costs:

1. Partitioning: Partitioning your table can significantly improve query performance and reduce storage costs. You can partition your table based on a specific column, such as date, so that data is stored in separate parts. This can be particularly useful if you often query data for specific date ranges.

2. Indexing: Proper indexing can greatly improve query performance. However, it's important to find the right balance as too many indexes can increase storage costs and slow down write operations. Indexes should be created on columns that are frequently used in WHERE, JOIN, and ORDER BY clauses.

3. Data Compression: Depending on the database system you're using, you might be able to use data compression techniques to reduce storage costs. This can include techniques like row-level or page-level compression, or even using columnar storage for large data warehousing workloads.

4. Data Archiving and Purging: If your table contains historical data that is no longer needed for day-to-day operations, consider archiving or purging this data. This can significantly reduce storage costs and improve query performance on the remaining data.

5. Normalization: If your table contains redundant data, consider normalizing it. This involves splitting the table into two or more tables and defining relationships between them. This can reduce storage costs and improve data integrity, but it can also increase the complexity of your queries.

6. Use Appropriate Data Types: Using the appropriate data type for each column can also help reduce storage costs. For example, using a smaller integer type (like INT instead of BIGINT) can save space if the larger range of values isn't needed.

7. Column Store Indexes: If you're using a database system that supports column store indexes, these can provide significant performance improvements for read-heavy workloads. Column store indexes store data column-wise instead of row-wise, which can be more efficient for querying large datasets.
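To make the indexing point (item 2) concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in database, so the behaviour shown is SQLite's and not that of any specific warehouse. The table name `orders`, the index name `idx_orders_date`, and the sample data are made up for illustration.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)

# In-memory database with a hypothetical orders table (30 rows, one per day).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (order_date, amount) VALUES (?, ?)",
    [(f"2024-04-{day:02d}", 10.0 * day) for day in range(1, 31)],
)

sql = "SELECT * FROM orders WHERE order_date = ?"
params = ("2024-04-26",)

# Without an index, the filter has to scan the whole table.
plan_before = query_plan(conn, sql, params)

# Index the column that is used in the WHERE clause.
conn.execute("CREATE INDEX idx_orders_date ON orders (order_date)")

# With the index, the planner can look up matching rows directly.
plan_after = query_plan(conn, sql, params)

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan wording differs between SQLite versions, but the first plan should report a scan of `orders` and the second should mention `idx_orders_date`. The same index also has to be maintained on every insert, which is the write-side cost mentioned above.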


Wednesday, April 17, 2024

Copilot for Power BI


Copilot for Power BI is a feature introduced by Microsoft to provide users with AI-powered assistance while creating reports and dashboards in Power BI Desktop. It uses natural language processing (NLP) capabilities to understand user queries and provide relevant suggestions and guidance in real-time as users build their data visualizations.

This feature is designed to make it easier for users, especially those who may not have extensive experience with data analytics or Power BI, to create effective and insightful reports. Copilot can suggest visualizations, recommend data transformations, and offer explanations or insights into the data being analyzed.

This feature needs to be enabled in the tenant settings of the Admin Portal, either at the organization level or for specific AD groups.


Requests to Copilot consume Fabric Capacity Units (CU). Copilot usage is measured by the number of tokens processed. Tokens can be thought of as pieces of words. As a reference, 1,000 tokens approximately represent 750 words. The Fabric Copilot cost is calculated per 1,000 tokens, and input and output tokens are consumed at different rates. This table defines how many CUs are consumed as part of Copilot usage.  

Operation in Metrics App | Description           | Operation Unit of Measure | Consumption rate
Copilot in Fabric        | The input prompt      | Per 1,000 Tokens          | 400 CU seconds
Copilot in Fabric        | The output completion | Per 1,000 Tokens          | 1,200 CU seconds

Capacity Units (CU) in pricing refer to the computing resources allocated to your Power BI deployment in the cloud. They are a measure of the performance and scalability of your Power BI environment.

If you’re utilizing Copilot for Power BI and your request involves 500 input tokens and 100 output tokens, then you’ll be charged a total of (500*400+100*1,200)/1,000 = 320 CU seconds in Fabric.
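The same arithmetic can be written as a small helper. This is only a sketch: the function and constant names are my own, and the rates are the ones quoted in this post, so check them against current Microsoft pricing before relying on the numbers.

```python
# Consumption rates quoted in this post (CU seconds per 1,000 tokens).
INPUT_CU_SECONDS_PER_1K = 400     # the input prompt
OUTPUT_CU_SECONDS_PER_1K = 1200   # the output completion

def copilot_cu_seconds(input_tokens, output_tokens):
    """Estimate the CU seconds consumed by one Copilot request."""
    weighted_tokens = (
        input_tokens * INPUT_CU_SECONDS_PER_1K
        + output_tokens * OUTPUT_CU_SECONDS_PER_1K
    )
    return weighted_tokens / 1000

# The example from the post: 500 input tokens and 100 output tokens.
print(copilot_cu_seconds(500, 100))  # 320.0
```

Output tokens cost three times as much as input tokens at these rates, so long generated answers matter more for capacity planning than long prompts.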

The cost of Fabric Capacity Units can vary depending on the region. Regardless of the consumption region where GPU capacity is utilized, customers are billed based on the Fabric Capacity Units pricing in their billing region.

For example, if a customer’s requests are mapped from region 1 to region 2, with region 1 being the billing region and region 2 being the consumption region, the customer is charged based on the pricing in region 1.

Data Security

Data such as prompts, grounding data included in prompts, and AI output will be processed and temporarily stored by Microsoft and may be reviewed by Microsoft employees for abuse monitoring.

To generate a response, Copilot uses:

  • The user's prompt or input and, when appropriate,

  • Additional data that is retrieved through the grounding process.

This information is sent to Azure OpenAI Service, where it's processed and an output is generated. Therefore, data processed by Azure OpenAI can include:

  • The user's prompt or input.

  • Grounding data.

  • The AI response or output.

Grounding data may include a combination of dataset schema, specific data points, and other information relevant to the user's current task. Review each experience section for details on what data is accessible to Copilot features in that scenario.

Interactions with Copilot are specific to each user. This means that Copilot can only access data that the current user has permission to access, and its outputs are only visible to that user unless that user shares the output with others, such as sharing a generated Power BI report or generated code. Copilot doesn't use data from other users in the same tenant or other tenants.

Copilot uses Azure OpenAI—not OpenAI's publicly available services—to process all data, including user inputs, grounding data, and Copilot outputs. Copilot currently uses a combination of GPT models, including GPT 3.5. Microsoft hosts the OpenAI models in Microsoft's Azure environment and the Service doesn't interact with any services by OpenAI (for example, ChatGPT or the OpenAI API). Your data isn't used to train models and isn't available to other customers.

Copilot in Power BI Service for Consumers


New Report/Page

Create a blank report by picking any published semantic model. In the Copilot tab, enter the prompt that best describes the business needs you want the new report to fulfill.


The semantic model that you intend to use for a new report needs to be built following best practices to get the best out of Copilot. The terminology and naming conventions used in the semantic model should be user friendly so that Azure OpenAI can recognize them.


Summarize Visuals

You can summarize the report visuals into bullet points, or you can generate a high-level gist of the contents for quick reference. In most cases, these quick insights generated on top of existing reports can be used for executive reporting and for presentations.


Copilot in Power BI Desktop for Developers

If you are unable to see Copilot in the toolbar in Power BI Desktop, you have to enable it explicitly from Options.


EU Data Boundary

Customers can configure their service to be in-scope for the EU Data Boundary by provisioning their tenant and all Microsoft Fabric capacities in an EU datacenter location. Customer Data and pseudonymized personal data are stored and processed in the EU Data Boundary, aside from specific residual transfers that are documented in "Services that transfer a subset of Customer Data or pseudonymized personal data out of the EU Data Boundary on an ongoing basis".

Fabric also enables the option to select an Azure region where Customer Data is stored when creating new Microsoft Fabric capacity. The default option listed is your tenant home region. If you select that region, all associated data, including Customer Data, is stored in that Geo. If you select a different region, some Customer Data is still stored in the home Geo. By selecting a region in the EU, Customer Data will be stored in the EU Data Boundary.

MS Fabric Capacity Metrics

Starting from February 2024, you can view the total capacity usage for Copilot under the operation name “Copilot in Fabric” in your Fabric Capacity Metrics App.


Saturday, September 9, 2023

Migration of Traditional Reports to Power BI - Quick Guide

Moving reports from traditional analytical tools to Power BI represents a significant stride in harnessing advanced visualization and data analysis capabilities. Below is an initial guide to assist you in the process.

Evaluation and Strategy:
  1. Ascertain which reports to transfer, prioritizing them based on their business significance and complexity.
  2. Gain insight into the data sources utilized in current reports. Determine whether they are compatible with Power BI; if they are not, complete the prerequisites and pre-processing needed to make them compatible.
  3. Clearly outline the migration's scope and objectives. Consider whether you're merely relocating reports or if there are opportunities to enhance them using Power BI's functionalities.
Data Readiness:
  1. Ensure that your data sources possess appropriate structure and have been thoroughly cleansed. Power BI performs optimally with well-organized, clean data.
  2. If your data resides in a database, confirm that you possess the requisite credentials and permissions to access it via Power BI.
Power BI Desktop:
  1. Download and install Power BI Desktop if you haven't already. This is the tool you'll employ to design and craft your reports.
  2. Launch Power BI Desktop and acquaint yourself with its interface and capabilities.
Establishing Data Connections:
  1. Within Power BI Desktop, navigate to the "Home" tab and select "Get Data." Choose the relevant data source (e.g., Excel, SQL Server).
  2. Connect to your data source by supplying the necessary connection particulars.
  3. Opt for data import or establish a direct connection, depending on your data source and specific requirements.
Data Transformation:
  1. Utilize Power Query Editor (accessible from the "Home" tab) to modify and structure your data as required.
  2. Undertake data cleansing, establish calculated columns, and implement any necessary transformations.
Report Development:
  1. Construct your reports by dragging and dropping visuals onto the canvas.
  2. Customize the visuals using the formatting choices found in the "Format" and "Visualizations" panels.
  3. Establish relationships between tables if your data necessitates such connections.
DAX Formulas:
  1. Familiarize yourself with Data Analysis Expressions (DAX), Power BI's formula language used for crafting custom calculations.
  2. Compose DAX calculations to derive insights that may not be readily accessible within your data source.
Report Design:
  1. Concentrate on creating reports that are visually captivating and easily comprehensible.
  2. Exploit Power BI's array of visualization options to effectively convey insights.
Interactive Elements:
  1. Leverage Power BI's interactive capabilities, including drill-through functionality, slicers, and bookmarks, to augment user engagement.
Publishing:
  1. After finalizing your report, save it as a Power BI Desktop (.pbix) file.
  2. Log in to your Power BI account (or sign up for a free account if you don't have one).
  3. Select the "Publish" button to upload your report to the Power BI service.
Sharing and Collaboration:
  1. Configure suitable permissions to govern report access. Use AD groups to provide access to the workspaces and any published App for the business.
  2. Disseminate the report to colleagues and stakeholders via direct links, embedded reports, or dashboards. 
Maintenance and Oversight:
  1. Make sure the scheduled refreshes are working fine and the data quality checks are in place.
  2. Monitor report usage and gather feedback to facilitate continuous enhancement.
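The first step under Evaluation and Strategy, prioritizing reports by business significance and complexity, can be sketched as a simple sort over a report inventory. Everything here is hypothetical: the report names, the 1 to 5 scoring scales, and the rule of migrating high-value, low-complexity reports first are assumptions to adapt to your own situation.

```python
from dataclasses import dataclass

@dataclass
class Report:
    name: str
    business_value: int  # 1 (low) to 5 (high), hypothetical scale
    complexity: int      # 1 (simple) to 5 (complex), hypothetical scale

def migration_order(reports):
    """Sort reports: highest business value first, then simplest first."""
    return sorted(reports, key=lambda r: (-r.business_value, r.complexity))

# Made-up inventory of existing reports.
inventory = [
    Report("Monthly sales summary", business_value=5, complexity=2),
    Report("Legacy audit extract", business_value=2, complexity=4),
    Report("Executive KPI dashboard", business_value=5, complexity=4),
    Report("Ad hoc inventory list", business_value=3, complexity=1),
]

for rank, report in enumerate(migration_order(inventory), start=1):
    print(rank, report.name)
```

Starting with valuable but simple reports tends to give early wins while the team is still learning Power BI; a different weighting may suit you better.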

Bear in mind that transitioning reports to Power BI might entail a learning curve, especially for newcomers to the platform. Exercise patience and make the most of the abundant online resources, encompassing tutorials, forums, and documentation, to aid in your journey.

To start the Power BI journey as a novice, refer to my previous post on the training/learning library.

Wednesday, September 6, 2023

Quick Start with Power BI - No prerequisites

    I have been in the IT space for almost 15 years now, and out of all the available analytical tools, such as Oracle Analytics, Tableau, Google Analytics, Cognos, and Power BI, I would easily pick Power BI to fulfill all my needs. And the reason is Microsoft.

     Power BI is widely used in businesses and organizations to gain insights from their data and make data-driven decisions. 

I would like to suggest the path and resources below for learning Power BI at a fast pace.
  1. Download Power BI Desktop from the Microsoft Store; it is free for all Windows users (no luck for Mac users yet). Make sure you download only the Desktop application.
  2. This is the best detailed learning course you can get on Udemy for reference, and I had a great time learning through it.
  3. Learning Power BI will unlock a lot of opportunities in the Business Analytics vertical, and having good knowledge of SQL and MS Excel is obviously a plus.
  4. Learning Power Query for transforming the source data is a key factor in flourishing in Power BI, as that is where most of your brain work goes (data modelling).
  5. I would split Power BI into two categories: an easy part and a hard part.
    • Easy part
      • Get Data from data sources as Microsoft has good integration with almost all available technologies including Snowflake.
      • Check the data quality and transform the data to fulfill basic needs in generating the insights.
      • Power BI Service, where you can see the published workspaces and manage your organisation's data with a corporate license from Microsoft.
      • Power Query, whose steps can be generated with the usual navigational features available in the Power BI Desktop tool.
      • Security management of your reports and data within the Power BI cloud.
      • Refreshing your data online using gateway connections and scheduled refreshes.
    • Hard part
      • Writing DAX queries to fulfill advanced requirements using custom measures and attributes.
      • Power BI Report Builder, which is used to generate pixel-perfect reports, also known as paginated reports.
      • Transforming the data by massaging the source data to the actual requirement of the report and insights which are to be fulfilled.
  6. As Microsoft is already one of the most trusted vendors across the IT market, integration with different systems is made easy, and new preview features are released very frequently for developers to explore and get the most out of the Power BI platform.
  7. There are many free resources online for Power BI, and you can find mainstream developers on YouTube with many tutorials.
  8. These days, with so many visual changes and preview features, Microsoft has stepped into next-generation reporting with Microsoft Fabric.

Note:- Remember that Power BI can be a powerful tool, but it may take some time to become proficient. Start with simple projects and gradually work your way up to more complex ones as you gain confidence and experience.

Published by K@run@

Wednesday, December 20, 2017

The Analytics Divide - Critical Capabilities for Analytics Powered by Big Data - Part 1

Competitive intensity has increased across industries in recent times. Companies are being driven to deliver a consistent stream of market successes via innovative business models and products or improved processes that continually enhance competitive advantage. Analytics powered by big data has been the propelling force behind this wave of innovation, and executives across industries are being challenged to replicate the ubiquitous success stories achieved with analytics.
However, the hype around analytics successes is tending to gloss over the critical enablers and hard work necessary to reach the end of that rainbow. Latent in that hype is an alternative reality where most companies are actually still struggling to figure out how to use analytics to take advantage of their data.
There is a deep analytical divide in the industry, which needs to be recognized. It can perhaps be explained only by the relative maturity of the prior BI programs of these analytical ‘haves’ where the critical enablers were more or less already in place. Most other organizations today have an analytics vision, but lack an analytics strategy backed up by a practical plan to get there. According to an MIT Sloan survey, only 30 percent of respondents overall declared having a formal long-term big data and analytics plan. As big data capabilities continue to become an enterprise enabler, those who have waited cannot remain in the harbor forever.
How to engineer the bridge over this divide is an extremely relevant topic for discussion today. In this three-blog series, I will analyze the enablers and impediments of big data adoption, identify the possibilities and priorities the industry has set regarding the big data and analytics domain, and look at the prevalent patterns and practices in the different journeys organizations are undertaking. I will share an incremental adoption roadmap based on these elements that will attempt to address the concerns that are holding back organizations, and I will suggest a reference architecture that supports that incremental build to support more advanced capabilities and progressive complexities of a big data capability.

Enablers and impediments to analytics success

The Harvard Business Review explains that, at this point in the evolution of big data, the challenges for most companies are not related to technology. While gaining technology capabilities poses a challenge to adopting big data in the enterprise, many other factors play a big role, including culture, strategy, skills, and internal investments. Here are some key drivers and impediments to success with big data:
Data-driven culture: The previously mentioned MIT Sloan survey explains that most companies are not prepared for the robust investment and cultural changes that are required to achieve sustained success with analytics, including expanding the skill set of managers who use data, broadening the types of decisions influenced by data, and cultivating decision-making that blends analytical insights with intuition.
Deployment challenges: Leveraging the potential of predictive models has quite a few practical challenges. An article from explains that an analytical model has to produce consistent and repeatable results across the entire spectrum of input conditions and be simple enough to be deployed across all the operations impacted by the model. It has to be robust and responsive to changes in the business environment while operating within the limitations and constraints faced by the business, abide by all regulations that apply within the scope, and be intuitively explainable to management as well as to the frontline agent who, in turn, has to explain the outcome to a customer or a partner.
Strategic analytics plan: Companies that are successful with analytics are also much more likely to have a strategic plan for analytics, and this plan is usually aligned with the organization’s overall corporate strategy. These companies use analytics more broadly across the organization, and they are able to measure the results of their analytical efforts. The previously mentioned MIT Sloan survey highlights that the companies that have pulled away from the pack, “the Analytical Innovators,” are five times more likely to have a formal strategy for analytics than the least mature group. These companies recognize that they need to put in place a robust analytics culture. Data analytics is used by their C-suite for providing strategic direction to the whole organization and used by middle management to improve day-to-day operation of the organization.
Data privacy concerns: One of the biggest data challenges is around privacy and what is shared versus what is not shared. Self-service data access and broad data exploration that are crucial for analytics are also inherently risky in terms of privacy violations and compliance infractions. To avoid these problems, data governance policies need to be updated or extended to encompass data from the organization’s data lake, and users should be trained in how the policies affect their work with data in the lake. But there are very few data management professionals available for hiring who have prior experience with data lakes and Hadoop to frame these policies and implement them.
Skill gap: Big data technologies require a skill set that is new to most IT departments, which need expert data engineers to integrate all the relevant internal and external sources of data. Data scientists in a big data team should be comfortable speaking the language of business and helping leaders reformulate their challenges in ways that big data can tackle. In a world that’s flooded with data, it has become harder to use this data: there’s too much of it to make sense of unless the analysis starts with an insight or hypothesis to test. Here, the role of domain specialists has become absolutely essential for asking the right questions. People with these skills are hard to find and in great demand.
Drying up investments: As the hype around big data has ebbed, big data projects are increasingly held to the same expectations for results as other IT projects. Where companies were previously willing to fill data lakes with big data projects, executives may now want to see tangible business results faster to justify the initial and ongoing organizational investment in these projects.
These impediments are largely preventing many organizations from embarking on their big data and analytics journey. Yet, an agile management culture tuned to rapidly changing market conditions is going to be a prerequisite to survival, if not success, in the next decade. Closing that capability gap is becoming mandatory for those that have yet to embrace analytics.
The requisite capabilities can only be gained through a managed transformation, an incremental build up in a phased approach where the big data journey is mapped in clear, achievable but increasingly challenging milestones. Here, success in each phase brings in capabilities required for the next level of complexity in terms of implementation complexity and organization change management.
In the next part of this blog, I will talk about the incremental complexities and more advanced capabilities and skills needed in inducting the different nature and types of big data, and the evolving architecture patterns needed to support that progressive complexity.

Source :- Suman Ghosh (Center of Excellence at TCS)
URL: :-