Snowflake SnowPro Specialty Snowpark Practice Test 1
Question 1 of 60
1. Question
What is the following sequence doing? (choose one)
_ = session.sql("create temp stage s").collect()
_ = session.file.put("tests/f.csv", "@s")
with session.file.get_stream("@s/f.csv.gz", decompress=True) as fd:
    assert fd.read(5) == b"1,one"
Explanation
create temp stage s creates a temporary Snowflake stage.
session.file.put("tests/f.csv", "@s") uploads a local CSV file to the stage.
When files are uploaded to a Snowflake stage, they are automatically compressed (for example, to .gz).
The assertion confirms that the decompressed content starts with b"1,one", proving that the file was uploaded, compressed, then downloaded and decompressed via a stream.
session.file returns a FileOperation object, which supports file upload, download, and streaming operations in Snowpark.
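The compress-then-stream-decompress round-trip in this question can be sketched locally with only the Python standard library. This is a hypothetical stand-in (no Snowflake session or stage involved): gzip.compress plays the role of the automatic compression done by PUT, and gzip.open plays the role of get_stream(..., decompress=True).

```python
import gzip
import io

# Stand-in for PUT: Snowflake would gzip-compress "tests/f.csv" to f.csv.gz on the stage.
original = b"1,one\n2,two\n"
compressed = gzip.compress(original)

# Stand-in for get_stream(..., decompress=True): read back through a decompressing stream.
with gzip.open(io.BytesIO(compressed), "rb") as fd:
    assert fd.read(5) == b"1,one"
```

The key point the question tests is the same as here: the bytes on the stage are compressed, but the stream returned with decompress=True yields the original content.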
Question 2 of 60
2. Question
Which of the following is a valid way to call a select method on a df dataframe with a col1 column, assuming the col function was properly imported? (choose one)
Which of the following DataFrame calls are lazy evaluated? (choose two)
Explanation
A Snowpark DataFrame is lazily evaluated, which means the SQL statement isn't sent to the server for execution until you perform an action. An action causes the DataFrame to be evaluated and sends the corresponding SQL statement to the server for execution. The following methods perform an action: collect, show, count and save_as_table (from a writer). limit and filter are just transformation methods, like most other DataFrame methods: they simply register instructions in the resulting client-side DataFrame object, without generating or executing any SQL query on the server side. Note that, unlike in pandas, most such methods do not modify the DataFrame in place, but create and return a new DataFrame object. You'll have at least 1-2 questions where you should prove you properly understand what the action methods in Snowpark are. References: https://docs.snowflake.com/en/developer-guide/snowpark/python/working-with-dataframes#label-snowpark-python-dataframe-action-method
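The transformation-versus-action split can be illustrated with a toy class. This is a hypothetical model, not the real Snowpark API (Snowpark builds SQL on the server, not Python closures): transformations only record instructions and return a new object; nothing executes until the action is called.

```python
class LazyFrame:
    """Toy model of a lazily evaluated DataFrame (NOT the real Snowpark API)."""

    def __init__(self, rows, ops=()):
        self._rows = rows
        self._ops = ops          # pending transformations, not yet executed

    def filter(self, pred):
        # Transformation: registers an instruction and returns a NEW object.
        return LazyFrame(self._rows, self._ops + (("filter", pred),))

    def limit(self, n):
        # Transformation as well: still nothing is executed here.
        return LazyFrame(self._rows, self._ops + (("limit", n),))

    def collect(self):
        # Action: only now is the pending work actually performed.
        rows = self._rows
        for op, arg in self._ops:
            if op == "filter":
                rows = [r for r in rows if arg(r)]
            elif op == "limit":
                rows = rows[:arg]
        return rows

df = LazyFrame([1, 2, 3, 4, 5])
df2 = df.filter(lambda r: r > 2).limit(2)   # nothing executed yet
print(df2.collect())                        # the action runs everything: [3, 4]
```

Note how filter and limit leave the original df untouched and return a new object, mirroring the non-in-place behavior described above.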
Question 4 of 60
4. Question
Which of the following is a valid action method call in Snowpark Python, on a DataFrame? (choose one)
Which of the following in a valid schema definition for a Snowpark DataFrame, assuming that all the import statements have been called already for all these types? (choose one)
Explanation
We must have a structure with fields, i.e. a StructType of StructField definitions. Note also that LongType, StringType, etc. must be instantiated with parentheses (e.g. LongType()). You may get many schema definitions like this one, usually in the context of more complex scenarios. The same Snowpark logical types can be passed in UDF and procedure definitions, for input parameters and returned values. References: https://docs.snowflake.com/en/developer-guide/snowpark/reference/python/1.2.0/types
Question 6 of 60
6. Question
What can you have in the signature of a Python function used for a vectorized UDF? (choose two)
Explanation
Vectorized Python UDFs let you define Python functions that receive batches of input rows as pandas DataFrames and return batches of results as pandas arrays or Series. You may be asked 2-3 questions about vectorized Python UDFs, and you must know the essentials. References: https://docs.snowflake.com/en/developer-guide/udf/python/udf-python-batch
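The handler shape such a vectorized UDF uses can be sketched with plain pandas, no Snowflake session required. This is a local sketch: in Snowflake the batches arrive automatically (input columns are keyed by position 0, 1, ...) and the decorator comes from the server-side _snowflake module; here we just build one batch by hand.

```python
import pandas as pd

def add_one_to_sum(batch: pd.DataFrame) -> pd.Series:
    # Receives a whole batch of rows as a pandas DataFrame (one column per
    # UDF argument, keyed by position) and returns one result per row
    # as a pandas Series.
    return batch[0] + batch[1] + 1

# Simulate one incoming batch of three rows with two input columns.
batch = pd.DataFrame({0: [1, 2, 3], 1: [10, 20, 30]})
print(add_one_to_sum(batch).tolist())   # [12, 23, 34]
```

The returned Series must have one value per input row; Snowflake's optional max_batch_size setting only caps how many rows land in each such batch.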
Question 7 of 60
7. Question
A DataFrame with three columns a, b and c has a single row, with None/NaN or NULL values in a and c (but not in b). Which statements will remove the row from the DataFrame? (choose three)
Explanation
By default, dropna is called with how="any", which means it deletes any row with at least one NaN or NULL value. Hence A and B are selected use cases. With how="all", all values from the selected columns must be NaN or NULL for the row to be deleted, so C is excluded. By selecting just a subset of columns, the rules apply only to those columns, so D is also valid, because a and c both have NaN or NULL in that row. The thresh parameter gives the minimum number of columns with non-NaN, non-NULL values that allow the row to be kept, so E is not valid, because we have a non-NULL value in b. This is a major data-cleanup method on a DataFrame you must learn about. It's likely you'll get even 2-3 questions about it, because it is a bit tricky, with its default parameter values and specific arguments. References: https://docs.snowflake.com/en/developer-guide/snowpark/reference/python/1.2.0/api/snowflake.snowpark.DataFrame.dropna
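Snowpark's dropna mirrors pandas semantics for how, subset and thresh, so the rules above can be exercised locally with pandas as a stand-in (no Snowflake session; the single-row fixture matches the question's scenario):

```python
import numpy as np
import pandas as pd

# One row: NaN in a and c, a valid value in b (the question's scenario).
df = pd.DataFrame({"a": [np.nan], "b": [1.0], "c": [np.nan]})

assert len(df.dropna()) == 0                   # default how="any": row removed
assert len(df.dropna(how="any")) == 0          # same as the default
assert len(df.dropna(how="all")) == 1          # kept: b is not NaN
assert len(df.dropna(subset=["a", "c"])) == 0  # both subset columns are NaN
assert len(df.dropna(thresh=1)) == 1           # 1 non-NaN value (b) >= 1: kept
assert len(df.dropna(thresh=2)) == 0           # needs 2 non-NaN values: dropped
```

The thresh lines show why a thresh-based option can fail to remove the row: the valid value in b alone can be enough to keep it.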
Question 8 of 60
8. Question
A DataFrame with three columns a, b and c has a single row, with None/NaN or NULL values in a and c (but not in b). Which statement will remove the row from the DataFrame? (choose one)
Explanation
By default, dropna is called with how="any", which means it deletes any row with at least one NaN or NULL value. But A fills those values with something valid first, so it does not apply. B calls fillna with an empty dictionary, which returns the same DataFrame unchanged, and dropna then finds at least one NaN or NULL cell, so it removes the row. With how="all", all values from the selected columns must be NaN or NULL for the row to be deleted, so C is excluded. By selecting just a subset of columns, the rules apply only to those columns, and since fillna replaces any invalid value in a, dropna can no longer remove the row. The thresh parameter gives the minimum number of columns with non-NaN, non-NULL values that allow the row to be kept, so E is not valid, because we have a non-NULL value in b. These are major data-cleanup methods on a DataFrame you must learn about. It's likely you'll get even 2-3 questions about them, because they are a bit tricky, with default parameter values and specific arguments. References: https://docs.snowflake.com/en/developer-guide/snowpark/reference/python/1.2.0/api/snowflake.snowpark.DataFrame.dropna https://docs.snowflake.com/en/developer-guide/snowpark/reference/python/1.2.0/api/snowflake.snowpark.DataFrame.fillna#snowflake.snowpark.DataFrame.fillna
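The fillna-then-dropna interplay described above can also be checked locally with pandas as a stand-in for the matching Snowpark calls (same single-row fixture as the question):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [np.nan], "b": [1.0], "c": [np.nan]})

# fillna with an empty dict changes nothing, so dropna still removes the row.
assert len(df.fillna({}).dropna()) == 0

# Filling every NaN first means dropna no longer finds anything to remove.
assert len(df.fillna(0).dropna()) == 1

# Filling only column "a" leaves the NaN in "c", so the row is still dropped.
assert len(df.fillna({"a": 0}).dropna()) == 0
```

Note that both fillna and dropna return new DataFrames rather than modifying df in place, which is why the calls chain.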
Question 9 of 60
9. Question
When should you use pandas on Snowflake? (choose one)
Explanation
You should use pandas on Snowflake if any of the following is true (check the link below):
· You are familiar with the pandas API and the broader PyData ecosystem.
· You work in a team with others who are familiar with pandas and want to collaborate on the same codebase.
· You have existing code written in pandas.
· Your workflow has order-related needs, as supported by pandas DataFrames. For example, you need the dataset to be in the same order for the entire workflow.
· You prefer more accurate code completion from AI-based copilot tools.
There are three major DataFrame APIs you should learn about for this exam, and you should understand the main differences between them: (1) the Snowpark DataFrame API, (2) the new pandas on Snowflake API, (3) the external pandas library. References: https://docs.snowflake.com/en/developer-guide/snowpark/python/pandas-on-snowflake#when-should-i-use-pandas-on-snowflake
Question 10 of 60
10. Question
Which of the following operations can do conversions between Snowpark DataFrames and Snowpark pandas DataFrames? (choose two)
What statements are true about the vectorized Python UDFs? (choose two)
Explanation
Vectorized Python UDFs only have the potential for better performance, if your Python code operates efficiently on batches of rows. You do not need to change your caller SQL queries, because only the Python signature is different internally. They can take pandas DataFrames as input, and return pandas arrays or Series objects. The @vectorized decorator is declared in an internal _snowflake module, available only on the server, in Snowflake's virtual warehouses. And yes, a max_batch_size parameter can limit each DataFrame batch to a maximum number of rows. Learn the basics of vectorized Python UDFs, as you'll have one or two questions about them. References: https://docs.snowflake.com/en/developer-guide/udf/python/udf-python-batch
Question 14 of 60
14. Question
You call df.write.mode("ignore").save_as_table("table1") for a table that already exists. What will be the outcome? (choose one)
You create a Snowpark stored procedure from a Python function with the following type hints: def p(session: Session, x: str, i: int). What signature will the Snowflake procedure have? (choose one)
You create a Snowpark UDF based on the following Python function:
def run(file_path):
    with SnowflakeFile.open(file_path, 'rb') as f:
        return imagehash.average_hash(Image.open(f))
How can you pass the path to a staged file to the UDF? (choose one)
What happens after a cache_result call on a DataFrame? (choose three)
Explanation
A call to cache_result caches the content of the current DataFrame and creates a new cached Table DataFrame. All subsequent operations on the returned cached DataFrame are performed on the cached data and have no effect on the original DataFrame. You can use Table.drop_table() or the with statement to clean up the cached result when it's not needed. An error will be thrown if a cached result is cleaned up and then used again, or if any other DataFrames derived from the cached result are used again. There could be more than one question about cache_result, as Snowflake finds this method very important for reusing data that was already transformed. References: https://docs.snowflake.com/en/developer-guide/snowpark/reference/python/1.23.0/snowpark/api/snowflake.snowpark.DataFrame.cache_result
Question 19 of 60
19. Question
You execute the code below:
df1 = session.create_dataframe(
    [[1, 2], [3, 4], [5, 6]], schema=["a", "b"])
df2 = session.create_dataframe(
    [[1, 7], [3, 8]], schema=["a", "c"])
df1.natural_join(df2, how="...").show()
What value should be passed in the last call, in the "how" parameter, to get the result table below? (choose one)
-------------------
|"A"  |"B"  |"C"  |
-------------------
|1    |2    |7    |
|3    |4    |8    |
|5    |6    |NULL |
-------------------
Explanation
A natural join is a join on the columns with the same name, so on column a here. The matches are: [1, 2] from df1 matches [1, 7] from df2, returning the first row, [1, 2, 7]. Then [3, 4] from df1 matches [3, 8] from df2, returning the second row, [3, 4, 8]. Finally, [5, 6] from df1 has no match in df2, returning the last row, [5, 6, NULL]. This is a LEFT OUTER join, which includes everything from df1 and only the matches from df2. As expected, you will have 2-3 questions about joining DataFrames in Snowpark. Pay attention to the how parameter values, including methods like join, cross_join and natural_join. References: https://docs.snowflake.com/en/developer-guide/snowpark/reference/python/1.23.0/snowpark/api/snowflake.snowpark.DataFrame.natural_join
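The left-outer natural-join result above can be reproduced locally with pandas merge as a stand-in (no Snowflake session; the Snowpark call would presumably be df1.natural_join(df2, how="left"), matching the LEFT OUTER semantics described):

```python
import pandas as pd

df1 = pd.DataFrame({"a": [1, 3, 5], "b": [2, 4, 6]})
df2 = pd.DataFrame({"a": [1, 3], "c": [7, 8]})

# A natural join matches on the shared column "a"; how="left" keeps every
# row from df1 and fills the unmatched c value with NULL (NaN in pandas).
result = df1.merge(df2, on="a", how="left")
print(result)
```

Row a=5 survives with c = NaN, which is exactly the [5, 6, NULL] row in the expected table.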
Question 20 of 60
20. Question
You upload the following JSON data from a staged file into a car_sales Snowflake table with one src VARIANT column, using the STRIP_OUTER_ARRAY option set to TRUE:
[
  { "customer": [{"name": "Joyce Ridgely"}] },
  { "customer": [{"name": "Bradley Greenbloom"}] }
]
How do you get all customer names from the whole table data? (choose one)
Explanation
Because STRIP_OUTER_ARRAY = TRUE is used, each top-level element of the JSON array becomes a separate table row. As a result, the table contains two rows, and in each row:
src is a JSON object
customer is an array of objects, not a single object
Important Snowpark rules for semi-structured data:
Snowpark does not support dot notation (src.customer.name) for JSON traversal
The top-level column must be referenced using df["src"] (or col("src"))
JSON fields and array elements must be accessed using square brackets
Why Option B is correct:
df["src"] correctly references the VARIANT column
["customer"] accesses the customer array
[0] selects the first (and only) object in the array
["name"] extracts the customer name
Why the others are incorrect:
A: Treats customer as an object instead of an array
C: Uses SQL-style dot notation, which is not supported in Snowpark Python
D: Fails to reference the top-level column (src) using the Snowpark DataFrame syntax
Working with JSON in Snowpark often requires careful attention to array vs. object structure and correct traversal syntax.
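The array-versus-object distinction can be checked with plain Python on the same JSON payload (a local sketch with no Snowflake session; the equivalent Snowpark traversal would be df["src"]["customer"][0]["name"]):

```python
import json

raw = """[
  { "customer": [{"name": "Joyce Ridgely"}] },
  { "customer": [{"name": "Bradley Greenbloom"}] }
]"""

# STRIP_OUTER_ARRAY = TRUE: each top-level array element becomes its own row.
rows = json.loads(raw)
assert len(rows) == 2

# "customer" is an ARRAY of objects, so index [0] is required before ["name"].
names = [row["customer"][0]["name"] for row in rows]
print(names)   # ['Joyce Ridgely', 'Bradley Greenbloom']
```

Skipping the [0] step fails here just as it does in Snowpark, because a list has no "name" key.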
Question 21 of 60
21. Question
You load the following array of JSON data into the src column of a DataFrame:
[{
  "dealership": "Valley View Auto Sales",
  "vehicle": [{"make": "Honda", "price": "20275"}]
},{
  "dealership": "Tindel Toyota",
  "vehicle": [{"make": "Toyota", "price": "23500"}]
}]
Apply a 10% tax on the price of each vehicle. (choose one)
From the following JSON array, loaded in a src column of a DataFrame, show the name and phone on separate rows, for each customer: (choose one)
[{
  "customer": [{"name": "Joyce Ridgely", "phone": "16504378889"}]
},{
  "customer": [{"name": "Bradley Green", "phone": "12127593751"}]
}]
You define the log level as it follows: ALTER ACCOUNT SET LOG_LEVEL = FATAL; ALTER FUNCTION MyPythonUDF SET LOG_LEVEL = INFO; What is the effective log level for the function? (choose one)
Which of the following statements is correct? (choose one)
Correct
Snowflake includes a default event table named SNOWFLAKE.TELEMETRY.EVENTS, which is also active. You can have only one active event table at a time in the account. You cannot change the inner structure of an event table. You do not define columns when you create an event table. To make an event table active, call ALTER ACCOUNT and pass the fully-qualified name of the table in the EVENT_TABLE parameter. References: https://docs.snowflake.com/en/developer-guide/logging-tracing/event-table-setting-up#label-logging-event-table-default
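For illustration (the table name is a placeholder, and the session.sql call is commented out because it needs a live connection), activating a custom event table looks like this:

```python
# Hypothetical fully-qualified event table name -- replace with your own.
event_table = "MY_DB.TELEMETRY.MY_EVENTS"

# Activate it at the account level (only one event table can be active at a time):
stmt = f"ALTER ACCOUNT SET EVENT_TABLE = {event_table}"
print(stmt)

# With a live Snowpark session:
# session.sql(stmt).collect()
```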
Question 28 of 60
28. Question
What happens when you call Session.add_import() for a UDF? (choose two)
Although the API looks very similar to scikit-learn, this pipeline is built using the Snowpark ML Modeling API, not sklearn or core Snowpark.
Key points:
Snowpark ML is a separate Python library from Snowpark
It lives under the snowflake.ml.modeling namespace
Classes such as Pipeline, OrdinalEncoder, and MinMaxScaler are reimplemented for distributed execution inside Snowflake
Why the other options are incorrect:
A: Uses sklearn, which runs locally and does not integrate with Snowpark ML pipelines
B: Mixes sklearn and Snowpark ML, which is not supported
C: Uses Snowpark (DataFrame API) namespaces, not Snowpark ML
Correct imports must come entirely from snowflake.ml.modeling, making Option D the correct answer.
Question 31 of 60
31. Question
You call DESC PROCEDURE on a stored procedure that you previously created and registered with Snowpark Python. Which of the following could be returned for PACKAGES? (choose one)
Which of the following statements will create a Medium Snowpark-optimized warehouse? (choose one)
Correct
The WAREHOUSE_TYPE parameter (not TYPE!), which defaults to STANDARD, must be set to SNOWPARK-OPTIMIZED (case-insensitive, but note the hyphen and the required quotes). The WAREHOUSE_SIZE parameter (not SIZE!) defaults to XSMALL, but only for standard warehouses. For Snowpark-optimized warehouses the default is MEDIUM, because they do not come in XSMALL or SMALL sizes. Beware: you may get some tricky questions about creating or altering Snowflake warehouses! References: https://docs.snowflake.com/en/sql-reference/sql/create-warehouse
Question 33 of 60
33. Question
What should you do to change an active Snowpark-optimized warehouse to an X-Small standard warehouse? (choose three)
Correct
The WAREHOUSE_TYPE parameter (the warehouse type) must be changed to STANDARD, and the WAREHOUSE_SIZE parameter (the warehouse size) must be changed to XSMALL or 'X-SMALL'. Snowpark-optimized warehouses are created as MEDIUM by default, as they cannot come in XSMALL or SMALL sizes. Beware that you must also suspend the warehouse when changing its type: a warehouse does not need to be suspended to set or change any of its properties, except for its type. To change the warehouse type, the warehouse must be in the suspended state. You may get some tricky questions about creating or altering Snowflake warehouses! References: https://docs.snowflake.com/en/sql-reference/sql/alter-warehouse
Question 34 of 60
34. Question
A virtual warehouse was just created with:
CREATE WAREHOUSE wh1;
Which of the following will successfully change its warehouse type? (Choose one)
Correct
When a warehouse is created with default settings, it is a STANDARD, XSMALL warehouse and it starts automatically.
Key rules in Snowflake:
A warehouse must be suspended to change its WAREHOUSE_TYPE
A SNOWPARK-OPTIMIZED warehouse cannot be XSMALL or SMALL
Therefore, the correct sequence is:
Suspend the warehouse
Increase the warehouse size to at least MEDIUM
Change the warehouse type
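The sequence above can be sketched as SQL statements (the warehouse name wh1 comes from the question; the session.sql calls are commented out since they need a live session):

```python
# Converting the default STANDARD / XSMALL warehouse wh1 into a Snowpark-optimized one.
steps = [
    "ALTER WAREHOUSE wh1 SUSPEND",                                    # type changes require a suspended warehouse
    "ALTER WAREHOUSE wh1 SET WAREHOUSE_SIZE = MEDIUM",                # Snowpark-optimized minimum is MEDIUM
    "ALTER WAREHOUSE wh1 SET WAREHOUSE_TYPE = 'SNOWPARK-OPTIMIZED'",  # the hyphen requires quotes
]
for s in steps:
    print(s)
    # session.sql(s).collect()   # uncomment with a live Snowpark session
```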
Question 35 of 60
35. Question
When should you use a Snowpark-optimized warehouse? (choose two)
Correct
While Snowpark workloads can run on both standard and Snowpark-optimized warehouses, Snowpark-optimized warehouses are recommended for workloads that have large memory requirements or dependencies on a specific CPU architecture. Example workloads include ML training use cases using a stored procedure on a single virtual warehouse node. Snowpark workloads utilizing UDFs or UDTFs might also benefit from Snowpark-optimized warehouses (see link below). Snowpark-optimized warehouses were introduced especially to train ML models on large data stored in Snowflake tables. To serve ML models, Snowflake instead recommends vectorized UDFs on multi-node standard warehouses. To improve the performance of complex SQL queries, you should instead try to resize a standard warehouse. Also, Snowpark-optimized warehouses are more expensive than standard warehouses of the same size. Remember the ML training use case, as it is very typical for this kind of expected exam question! References: https://docs.snowflake.com/en/user-guide/warehouses-snowpark-optimized
Question 36 of 60
36. Question
Which parameter allows you to modify the memory resources and CPU architecture for a Snowpark-optimized warehouse? (choose one)
How can you create a Snowpark session in your Python application? (choose three)
Correct
The typical way to programmatically connect to Snowflake with a Snowpark session is to call Session.builder.configs(params).create(). The parameters, such as account, user, password, warehouse, role, database, and schema, are passed through a Python dictionary. A Snowpark session is also a wrapper around a Snowflake Connector for Python connection, but that is the legacy client-side way to connect to Snowflake. The get_active_session global function takes no parameters and returns the current session, when you are already connected to an account. You'll get many questions about connecting to Snowflake through a Snowpark session, so know the different use cases and possible connection parameters inside out. References: https://docs.snowflake.com/en/developer-guide/snowpark/python/creating-session https://docs.snowflake.com/en/developer-guide/snowpark/reference/python/1.23.0/snowpark/api/snowflake.snowpark.context.get_active_session
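A minimal sketch of both approaches (all identifiers below are placeholders, not real credentials, and the Snowpark imports are commented out so the snippet stands alone):

```python
# Password-based connection parameters for Session.builder.configs(...).create().
connection_parameters = {
    "account":   "<account_identifier>",  # required
    "user":      "<username>",            # required
    "password":  "<password>",            # required for password auth
    "role":      "MY_ROLE",               # optional -- defaults to the user's default role
    "warehouse": "MY_WH",                 # optional -- defaults to the user's default warehouse
    "database":  "MY_DB",                 # optional
    "schema":    "PUBLIC",                # optional
}

# With the snowflake-snowpark-python package installed:
# from snowflake.snowpark import Session
# session = Session.builder.configs(connection_parameters).create()

# Inside an already-connected environment (e.g. a Python worksheet):
# from snowflake.snowpark.context import get_active_session
# session = get_active_session()   # takes no parameters
```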
Question 38 of 60
38. Question
Which of the following is a required Snowpark session parameter, when connecting with a key pair? (choose one)
You try to establish a connection using the Snowflake Connector for Python and key-pair authentication. Which of the following are required connection parameters? (choose two)
Correct
You must use account, not host, and a password is not required for key-pair authentication. The role and warehouse are not required and are provided by default. A database and schema are optional. Check the link below, which shows how to use private_key_file and private_key_file_pwd for a local serialized private key: set the private_key_file parameter in the connect function to the path to the private key file, and set the private_key_file_pwd parameter to the passphrase of the private key file. Beware also that there are differences with a Snowpark session, which may require just a private_key parameter; here you are required to use the Snowflake Connector for Python, which is different. References: https://docs.snowflake.com/en/developer-guide/python-connector/python-connector-connect#using-key-pair-authentication-and-key-pair-rotation
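A hedged sketch of the required parameters (paths and identifiers are placeholders; the connect call is commented out because it needs a reachable account):

```python
# Key-pair authentication with the Snowflake Connector for Python:
# account (not host) and user are required; password is NOT used.
params = {
    "account": "<account_identifier>",
    "user": "<username>",
    "private_key_file": "/path/to/rsa_key.p8",  # path to the serialized private key
    "private_key_file_pwd": "<passphrase>",     # only if the key file is encrypted
}

# import snowflake.connector
# conn = snowflake.connector.connect(**params)
```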
Question 40 of 60
40. Question
You pass a Python dictionary of connection parameters to Session.builder.config(params).create(). However, the call fails. What was the problem? (choose one)
Which of the following parameters are optional, when you connect to Snowflake with a Snowpark session? (choose three)
Correct
You see the password here, and no other parameters specific to another type of authentication, so this is password-based authentication, in which case account, user, and password are all three required. The rest are what we call session context parameters. A role and a warehouse are almost always required, but they can be provided by default, from those associated with the current user. A warehouse is almost always required to execute most SQL statements. A database and a schema are optional: you may avoid using fully-qualified names for schema objects if you provide them, but they are not mandatory. References: https://docs.snowflake.com/en/developer-guide/snowpark/python/creating-session#connect-by-specifying-connection-parameters https://medium.com/snowflake/master-snowflake-session-parameters-in-snowpark-fcadcd11035b
Question 42 of 60
42. Question
Which is a type of parameter you can pass to the Session.builder.config function? (choose one)
What is returned by the following query, which you can execute in a SQL worksheet in the Snowflake web UI? (Choose one)
SELECT DISTINCT $4 FROM snowflake.information_schema.packages WHERE $3 = 'python';
Correct
The SNOWFLAKE.INFORMATION_SCHEMA.PACKAGES system view returns package-related metadata in the following order:
Package name
Package version
Language
Installed runtime version
By filtering on $3 = 'python', the query restricts results to Python-based packages. Selecting DISTINCT $4 therefore returns the Python runtime versions associated with the supported Snowpark packages.
This view is especially important for understanding Snowpark package and runtime version management in Snowflake.
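To make the positional mapping concrete, the query from the question can be rebuilt as a string (the session.sql call is commented out because it needs a live session):

```python
# Positional columns of SNOWFLAKE.INFORMATION_SCHEMA.PACKAGES, per the explanation above:
#   $1 = package name, $2 = package version, $3 = language, $4 = runtime version
query = (
    "SELECT DISTINCT $4 "
    "FROM snowflake.information_schema.packages "
    "WHERE $3 = 'python'"
)
print(query)
# session.sql(query).show()   # with a live Snowpark session
```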
Question 46 of 60
46. Question
An active WH1 Snowflake warehouse has been created with the default parameter values. Which of the following statements will fail? (choose one)
Correct
CREATE WAREHOUSE wh1, with the default parameter values, creates a Standard X-Small virtual warehouse. Trying to change it to X-Small again will not generate an error, and you can specify XSMALL or 'X-SMALL' (the hyphen requires quotes). The only time a warehouse must be suspended is when changing its type (a warehouse does not need to be suspended to set or change any of its properties, except for its type; to change the warehouse type, the warehouse must be in the suspended state). Setting a scaling policy makes sense only for a multi-cluster warehouse, but the statement will still not fail. Finally, setting auto-suspend to 0 is not recommended, because the warehouse will then never suspend, but the SQL statement is still valid. Beware: you may get some tricky questions about creating or altering Snowflake warehouses! References: https://docs.snowflake.com/en/sql-reference/sql/alter-warehouse
Question 47 of 60
47. Question
What happens when you deploy a stored proc through Snowpark Python, using a supported Anaconda package but with an unspecified version? (choose two)
You register and execute a stored proc with Snowpark Python from a local environment, where you have a more recent version of one of your supported packages than any version deployed for that package. What will happen at runtime? (choose one)
You installed Snowpark Python in your local virtual environment a while ago, with pip. A new version became available. How do you install it? (choose one)
Correct
It may look like rather basic Python knowledge, but setting up a development environment for Snowpark is a requirement for this exam, and you may get quite a few questions about pip and conda. In this case, this is the typical way you upgrade any Python package to a new version, using pip. References: https://stackoverflow.com/questions/47071256/how-to-update-upgrade-a-package-using-pip
Question 50 of 60
50. Question
You registered a stored procedure with a supported Python package, but without specifying a version. How can you quickly check which version has been deployed, for that package and its dependencies? (choose one)
Correct
DESC[RIBE] PROCEDURE will also show you a row with INSTALLED_PACKAGES, where all deployed packages, including the dependencies, are displayed with their versions. The PACKAGES row will show whatever you sent: if you did not specify package versions, they will not be listed there. SHOW PROCEDURES does not display that level of detail. And the PACKAGES system view is for all supported packages. References: https://docs.snowflake.com/en/developer-guide/udf/python/udf-python-packages#displaying-imported-packages
Question 51 of 60
51. Question
As a Snowpark Specialist, you write and execute the code below, on a file you previously uploaded in a named stage:
schema = StructType([
    StructField("EMPLOYEE_ID", IntegerType()),
    StructField("FIRST_NAME", StringType()),
    StructField("LAST_NAME", StringType())])
df = session.read.schema(schema
    ).options({"field_delimiter": ",", "skip_header": 1}
    ).csv('@my_stage/my_file.csv')
df.show(3)
Assuming all the import statements have been properly executed on the snowflake.snowpark namespaces, what will happen behind the scenes upon the execution of that code? (choose two)
Correct
Lazy evaluation postpones all work until show is called. Upon that call, a SELECT SQL statement is constructed on the fly, which queries the lines directly from the CSV file. The query also limits the returned data to the top 3 entries and nothing more. While querying a CSV file with options, a temporary file format, dropped at the end, is likely to be created. The options call includes skip_header: 1, which tells us that the column definitions are instead provided explicitly, from the schema structure. The show method renders the result on screen in a friendly textual manner; the collect method would return a collection of Row objects instead. References: https://thinketl.com/how-to-create-dataframes-in-snowflake-snowpark/#11_How_to_create_a_DataFrame_in_Snowpark_by_reading_files_from_a_stage
Question 52 of 60
52. Question
What file types that do not accept a schema can you load into a DataFrame using a DataFrameReader? (choose one)
Correct
You can specify the schema of the data that you plan to load by constructing a types.StructType object and passing it to the schema method, but only when the file format is CSV. Other file formats such as JSON, XML, Parquet, ORC, and Avro don't accept a schema. YAML is not a file format supported by the Snowpark I/O operations. You must know which types of files you can load into and unload from Snowflake, and with the Snowpark Python classes in particular. References: https://docs.snowflake.com/en/developer-guide/snowpark/reference/python/1.4.0/api/snowflake.snowpark.DataFrameReader
Question 53 of 60
53. Question
The following line loads a Parquet file, automatically inferring the schema:

df = session.read.parquet("@mystage/test.parquet"
    ).where(col("num") == 2)

How can you load the file without inferring the schema? (choose one)
Correct
The code could become: df = session.read.option("infer_schema", False).parquet("@mystage/test.parquet").where(col("num") == 2) The schema method can be called only for CSV files. Note how you obtain a DataFrameReader through the session.read property, then fluently call methods like schema or options, and end up with a DataFrame returned by csv, parquet, json, etc. References: https://docs.snowflake.com/en/developer-guide/snowpark/reference/python/1.4.0/api/snowflake.snowpark.DataFrameReader
Question 54 of 60
54. Question
As a Snowpark Specialist, you load a JSON file into a DataFrame from an internal stage: df = session.read.json("@mystage/test.json") Assuming all import statements have been provided and executed for any snowflake.snowpark dependencies, how can you return a list of Rows with the top-level JSON objects whose name property is John? (choose one)
Correct
Only collect returns a list of Row objects; show renders the result in text mode on screen. While JSON does not have an explicit schema, you may refer to the only column with col("$1"), then use the key dictionary notation to refer to individual properties. The filter and where methods are synonyms. Note also the use of lit, because you compare column values against a literal. You'll likely get a lot of questions about drilling down into JSON semi-structured data; they are usually combined with more complex calls and point to other topics as well. References: https://docs.snowflake.com/en/developer-guide/snowpark/reference/python/1.4.0/api/snowflake.snowpark.DataFrameReader
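As an illustration of the drill-down logic only, here is a pure-Python analogue (not Snowpark; the sample data is made up) of filtering top-level JSON objects on their name property, the way filter(col("$1")["name"] == lit("John")) would against a staged file:

```python
import json

# Pure-Python analogue (not Snowpark) of filtering top-level JSON
# objects on a property value before collecting them as rows.

raw = '[{"name": "John", "age": 30}, {"name": "Jane", "age": 25}]'
rows = [obj for obj in json.loads(raw) if obj["name"] == "John"]
print(rows)  # only the objects whose name property is John
```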
Question 55 of 60
55. Question
You have a Snowpark DataFrame of employees with the columns EMP_NAME, SALARY, and DEPT_ID.
Add a column that contains the maximum salary between the current employee and the next employee from the same department, with employees ordered by name. The final result should be sorted by employee name regardless of department. (Choose one)
Correct
The requirement is to add a new column, not replace or reduce the DataFrame. This immediately points to using withColumn, which makes Options B and D invalid.
A window function is required because the result is not a grouped aggregation; instead, it computes an inline aggregation per row. This is exactly what window functions are designed for.
Key points that make Option A correct:
partitionBy("DEPT_ID") ensures employees are grouped by department
orderBy(col("EMP_NAME")) sorts employees by name within each department
rows_between(Window.currentRow, 1) defines a window that includes:
the current row
the next row (by name) in the same department
max("SALARY").over(ws) computes the maximum salary across those two rows
.sort("EMP_NAME") ensures the final output is sorted globally by employee name, independent of department
Why the others fail:
range_between operates on value ranges, not row positions
Invalid row boundaries (negative or reversed ranges) do not match the requirement
Misplaced .over() or incorrect use of select instead of withColumn
This pattern closely mirrors SQL window functions and is commonly tested in Snowpark-related questions.
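A minimal pure-Python sketch of the window semantics (not Snowpark; the employees sample is made up): for each employee, take the maximum of their own salary and the next employee's salary, by name, within the same department, then sort the final result globally by name.

```python
from collections import defaultdict

# Pure-Python analogue (not Snowpark) of
# Window.partitionBy("DEPT_ID").orderBy("EMP_NAME")
#   .rows_between(currentRow, 1) with max("SALARY").

employees = [
    {"EMP_NAME": "Ann", "SALARY": 100, "DEPT_ID": 1},
    {"EMP_NAME": "Bob", "SALARY": 150, "DEPT_ID": 1},
    {"EMP_NAME": "Cal", "SALARY": 120, "DEPT_ID": 2},
]

by_dept = defaultdict(list)
for e in employees:
    by_dept[e["DEPT_ID"]].append(e)      # partition by department

result = []
for dept in by_dept.values():
    dept.sort(key=lambda e: e["EMP_NAME"])   # order within the partition
    for i, e in enumerate(dept):
        frame = dept[i:i + 2]                # current row + next row
        result.append({**e, "MAX_SAL": max(r["SALARY"] for r in frame)})

result.sort(key=lambda e: e["EMP_NAME"])     # global sort, like .sort("EMP_NAME")
print(result)
```

Ann gets 150 (the larger of her 100 and Bob's 150), Bob keeps 150 (last row of his department), Cal keeps 120.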
Question 56 of 60
56. Question
You have a Snowpark DataFrame of employees with the columns EMP_NAME, SALARY, and DEPT_ID. Keep only the employees with the highest salary per department. (Choose one)
Correct
To keep the highest-paid employee per department, you must use a window function, just like in SQL.
Key points that make Option D correct:
row_number() is a window function from snowflake.snowpark.functions
The window specification:
partition_by("DEPT_ID") groups employees by department
order_by(col("SALARY").desc()) ranks employees from highest to lowest salary within each department
Filtering on RANK == 1 keeps only the top-paid employee per department
Why the others are incorrect:
A: call_udf is not used for built-in window functions
B: select would drop other required columns instead of appending one
C: over(ws) must be applied directly to the window function, not after with_column
This pattern is a direct parallel to SQL window functions and is commonly tested in Snowpark certification exams.
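A pure-Python analogue of the row_number-per-partition pattern (not Snowpark; the sample data is made up): rank by salary descending within each department and keep only the top row per department.

```python
# Pure-Python analogue (not Snowpark) of
# row_number().over(partition_by("DEPT_ID").order_by(col("SALARY").desc()))
# filtered to rank 1: one highest-paid employee per department.

employees = [
    {"EMP_NAME": "Ann", "SALARY": 100, "DEPT_ID": 1},
    {"EMP_NAME": "Bob", "SALARY": 150, "DEPT_ID": 1},
    {"EMP_NAME": "Cal", "SALARY": 120, "DEPT_ID": 2},
]

top = {}
for e in employees:
    best = top.get(e["DEPT_ID"])
    if best is None or e["SALARY"] > best["SALARY"]:
        top[e["DEPT_ID"]] = e        # keeps the rank-1 row per department

result = sorted(top.values(), key=lambda e: e["DEPT_ID"])
print(result)
```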
Question 57 of 60
57. Question
You have source (s) and target (t) Snowpark DataFrames, both with NAME and SAL columns. You need to update t with the SAL value from s when rows have the same NAME value, or insert new rows into t when there is no such match. (choose one)
Correct
This is all about the Table merge method in Snowpark; note how it's done. It basically combines data from two DataFrames. You have to define the matching condition the same way you provide one for filter/where. Then you usually provide an array with two elements: one for the matching situation (when you have to UPDATE), and one for no match (when you have to INSERT). when_matched() and when_not_matched() are global contextual functions provided specifically for MERGE. The MERGE statement in SQL is rather different. Get used to the Snowpark DataFrame syntax, as you may well get one or two questions on this. References: https://docs.snowflake.com/en/developer-guide/snowpark/reference/python/1.9.0/api/snowflake.snowpark.Table.merge https://thinketl.com/how-to-merge-two-dataframes-in-snowflake-snowpark/
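The UPDATE-on-match / INSERT-on-no-match semantics can be sketched in pure Python (not Snowpark; the target and source rows are made up):

```python
# Pure-Python analogue (not Snowpark) of
# t.merge(s, t["NAME"] == s["NAME"],
#         [when_matched().update(...), when_not_matched().insert(...)]).

target = [{"NAME": "alice", "SAL": 100}, {"NAME": "bob", "SAL": 200}]
source = [{"NAME": "bob", "SAL": 250}, {"NAME": "carol", "SAL": 300}]

idx = {row["NAME"]: row for row in target}   # the matching condition, on NAME
for s_row in source:
    match = idx.get(s_row["NAME"])
    if match is not None:
        match["SAL"] = s_row["SAL"]          # when_matched(): UPDATE
    else:
        target.append(dict(s_row))           # when_not_matched(): INSERT

print(target)
```

bob's salary is updated to 250 from the source, and carol is inserted as a new row.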
Question 58 of 60
58. Question
What will the following sequence do? (Choose one)
session.sql("select * from table1").drop_table()
Correct
session.sql() returns a DataFrame, not a Table object.
In Snowpark:
Table is a specialization of DataFrame that represents a physical Snowflake table
DDL/DML methods such as drop_table(), delete(), update(), and merge() exist only on Table, not on DataFrame
Since drop_table() is not a DataFrame method, calling it on the result of session.sql() causes a failure.
To drop a table, you must first obtain a Table object, for example:
session.table("table1").drop_table()
Be careful when working with Snowpark objects: although a Table inherits from DataFrame, not every DataFrame represents a Snowflake table, and only Table supports table-level DDL operations.
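The inheritance relationship can be illustrated with two tiny hypothetical classes (these are not the real Snowpark classes, just a conceptual sketch of why the call fails): Table extends DataFrame, but table-level DDL exists only on Table.

```python
# Conceptual illustration (hypothetical classes, not the Snowpark API)
# of why drop_table() fails on the result of session.sql().

class DataFrame:          # what session.sql(...) conceptually returns
    def collect(self):
        return []

class Table(DataFrame):   # what session.table("table1") conceptually returns
    def drop_table(self):
        return "dropped"

df = DataFrame()
tbl = Table()

print(hasattr(df, "drop_table"))   # False: calling it raises AttributeError
print(tbl.drop_table())            # only Table exposes the DDL method
```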
Question 59 of 60
59. Question
A table of employees has EMP_ID and DEPT_ID columns. Display on screen the departments with more than one employee. (choose one)
Correct
There is no separate HAVING function in the Snowpark DataFrame API: you must call the same filter/where method, but after the group_by, with the condition applied on the aggregate result. As you can see here, you must first provide an alias for the aggregate calculation, to be able to use it later in the filter condition. One important thing to remember, when using grouping and aggregates in Snowpark, is that you need the call to group_by before agg. And all aggregate functions (usually defined in snowflake.snowpark.functions) should be called inside agg. References: https://docs.snowflake.com/en/developer-guide/snowpark/reference/python/latest/snowpark/api/snowflake.snowpark.DataFrame.groupBy https://thinketl.com/group-by-in-snowflake-snowpark/
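The group-then-filter-on-aggregate pattern can be sketched in pure Python (not Snowpark; the employees sample is made up), mirroring group_by("DEPT_ID").agg(count(...).alias("CNT")).filter(col("CNT") > 1):

```python
from collections import Counter

# Pure-Python analogue (not Snowpark) of GROUP BY ... HAVING COUNT > 1:
# group, aggregate, then filter on the aggregate result.

employees = [
    {"EMP_ID": 1, "DEPT_ID": 10},
    {"EMP_ID": 2, "DEPT_ID": 10},
    {"EMP_ID": 3, "DEPT_ID": 20},
]

counts = Counter(e["DEPT_ID"] for e in employees)     # group_by + count
busy = {d: n for d, n in counts.items() if n > 1}     # filter after aggregation
print(busy)  # departments with more than one employee
```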