Skip to content

Conversation

mchades
Copy link
Contributor

@mchades mchades commented Dec 18, 2023

What changes were proposed in this pull request?

  • set hdfs as HDFS superuser group in container
  • use datastrato as the Hive catalog Integration Test user instead of root

Why are the changes needed?

we should not use the root user directly to access HDFS

Fix: #554

Does this PR introduce any user-facing change?

no

How was this patch tested?

existing ITs

@mchades mchades requested review from jerryshao and xunliu December 18, 2023 11:45
@mchades mchades self-assigned this Dec 18, 2023
@mchades
Copy link
Contributor Author

mchades commented Dec 18, 2023

This PR, based on #620, adds owner validation for #620 (review).

@xunliu Please review it again, thanks!

xunliu
xunliu previously approved these changes Dec 19, 2023
Copy link
Member

@xunliu xunliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM


HADOOP_PACKAGE_NAME="hadoop-${HADOOP_VERSION}.tar.gz" # Must export this variable for Dockerfile
HADOOP_DOWNLOAD_URL="http://archive.apache.org/dist/hadoop/core/hadoop-${HADOOP_VERSION}/${HADOOP_PACKAGE_NAME}"

HIVE_PACKAGE_NAME="apache-hive-${HIVE_VERSION}-bin.tar.gz" # Must export this variable for Dockerfile
HIVE_DOWNLOAD_URL="https://archive.apache.org/dist/hive/hive-${HIVE_VERSION}/${HIVE_PACKAGE_NAME}"

JDBC_DIVER_PACKAGE_NAME="mysql-connector-java-${MYSQL_JDBC_DRIVER_VERSION}.tar.gz" # Must export this variable for Dockerfile
JDBC_DIVER_DOWNLOAD_URL="https://downloads.mysql.com/archives/get/p/3/file/${JDBC_DIVER_PACKAGE_NAME}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you use this version? We used another version in JDBC integration tests and playground.
https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.27/mysql-connector-java-8.0.27.jar

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have been using this version from the beginning(see following ref) in Hive-CI-Docker, but the JDBC integration tests and the playground introduced later are inconsistent with the previous version.

https://github.com/datastrato/gravitino/blob/dbdc1c9a6cbf31cff9d56c2e93a8bf8c1a9b13cd/dev/docker/hive/Dockerfile#L131-L132

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, we don't need to keep consistent in this pr. I guess we can change them to a consistent version in later pr.

@qqqttt123
Copy link
Contributor

Will you raise a follow-up pr to modify the Docker image which integration-test uses and the document?

@mchades
Copy link
Contributor Author

mchades commented Dec 19, 2023

Will you raise a follow-up pr to modify the Docker image which integration-test uses and the document?

The Docker image datastrato/gravitino-ci-hive:0.1.7 was pushed to the DockerHub. The integration test image that the PR currently uses and passes is the latest.

The document also modified in this PR(see file docs/docker-image-details.md)

Copy link
Contributor

@qqqttt123 qqqttt123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@jerryshao
Copy link
Contributor

Do you also need to ship a new hive image for users? cc @qqqttt123

@qqqttt123
Copy link
Contributor

Do you also need to ship a new hive image for users? cc @qqqttt123

No. This image is used by developers.

@jerryshao jerryshao merged commit 9dddd0c into apache:main Dec 19, 2023
@mchades mchades deleted the issue-554 branch November 22, 2024 03:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Improvement] User hive cannot access /tmp directory in IT container
4 participants